The Big Five Inventory (BFI) is a self-report measure designed to assess the high-order personality traits 
of Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness. As part of the 
International Sexuality Description Project, the BFI was translated from English into 28 languages and 
administered to 17,837 individuals from 56 nations. The resulting cross-cultural data set was used to 
address three main questions: Does the factor structure of the English BFI fully replicate across cultures? 
How valid are the BFI trait profiles of individual nations? And how are personality traits distributed 
throughout the world? The five-dimensional structure was robust across major regions of the world. Trait 
levels were related in predictable ways to self-esteem, sociosexuality, and national personality profiles. 
People from the geographic regions of South America and East Asia were significantly different in open- 
ness from those inhabiting other world regions. The discussion focuses on limitations of the current data 
set and important directions for future research. 

Many popular psychological assessment instruments, originally developed in English, 
have been translated into numerous languages and are now commonly used throughout the 
world (e.g., Butcher, Lim, & Nezami, 1998; Nichols, Padilla, & Gomez-Maqueo, 2000). 
Most of these translations were made with an explicit or at least tacit assumption that the 
core psychological constructs assessed by the measures substantively transcend human 
language and culture. Some researchers have expressed concern with this assumption 
(F. M. Cheung & Leung, 1998; Misra, 1994; Shweder, 1990) and have questioned whether 
the uncritical extension of “Western” ways of thinking to the rest of the world should serve 
as standard practice in psychological science (cf. Church, 2000). Although many of these 
issues remain unresolved (Triandis, 1997), what seems clear is that when psychological 
measures are simply translated from their original English and etically imported “as is” 
into diverse cultures, comparing the assessment results from different cultures becomes 
highly problematic (Brislin, 1993; van de Vijver, 2000). 


PROBLEMS IN COMPARING PERSONALITY TRAIT SCORES ACROSS CULTURES 


For psychologists seeking to investigate personality traits across cultures, one of the more 
vexing problems has centered on whether personality trait scales possess conceptual and func- 
tional equivalence across cultures (Brislin, 1993; Lonner, 1979; Triandis, 1994; van de Vijver 
& Leung, 2000). Particularly troublesome has been establishing whether the mean scores 
across different cultures show metric or scalar equivalence (Byrne & Campbell, 1999; Little, 
2000). That is, when comparing the mean scores of different cultures on a personality trait 
scale, any observed differences may exist not only because of a real cultural disparity on some 
personality trait but also because of inappropriate translations, biased sampling, or the non- 
identical response styles of people from different cultures (Diener & Suh, 2001; Grimm & 
Church, 1999; van de Vijver, 2000). All of these factors can be difficult to fully control, 
making some methodologists extremely skeptical about achieving true metric comparability 
of scores on the same test in different languages or cultures (Heine, Lehman, Peng, & 
Greenholtz, 2002; Poortinga & van Hemert, 2001; van de Vijver & Leung, 1997). Although 
much of this skepticism is certainly warranted, new research methods and analysis strategies 
are emerging that facilitate the comparability of cross-cultural personality data (Allen & 
Walsh, 2000; G. M. Cheung & Rensvold, 2000; Church & Lonner, 1998). 

Among the more common methods for establishing the cross-cultural comparability of 
personality trait measures is to first show that the trait scales contained in the measures are 
internally reliable across all targeted languages and cultures. A second frequently 
employed technique is to demonstrate a high degree of factorial structure invariance across 
different linguistic and cultural contexts (e.g., Caprara, Barbaranelli, Bermudez, Maslach, 
& Ruch, 2000). Third, functional equivalence can be demonstrated by showing the trait 
scales relate to external variables in similar ways. Finally, metric equivalence can be estab- 
lished through differential item functioning analysis and bilingual administrations 
(Ramirez-Esparza, Gosling, Benet-Martinez, Potter, & Pennebaker, 2006). In all instances, 
if psychometric problems are identified with particular scale items or constructs, new 
items or translations are sometimes implemented to improve the comparability of mea- 
sures (Brislin, 1986; van de Vijver, 2000). Historically, if trait scales from a personality 
measure showed high internal reliability, invariant factor structure, similar external corre- 
lates, and item equivalence across different languages and cultures, comparing the mean 
scores across cultures was often deemed a reasonable next step (see Steel & Ones, 2002; 
van de Vijver & Leung, 2001). However, even with evidence of reliability, factor invari- 
ance, correlational similarity, and item equivalence, problems can remain in how to metri- 
cally interpret mean-level differences in personality traits across cultures (Byrne & 
Campbell, 1999; Heine et al., 2002; Little, 2000). 
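
The first of these checks, internal reliability, can be illustrated with a short sketch. The passage above does not name a specific coefficient, so the use of Cronbach's alpha here is an assumption (it is a common choice for scale reliability), and the function name `cronbach_alpha` and the toy responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete responses.

    rows: one list of item scores per respondent, all of equal length.
    Assumes no missing items and at least two items and two respondents.
    """
    k = len(rows[0])                       # number of items in the scale
    items = list(zip(*rows))               # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])   # variance of sum scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Made-up 1-5 ratings from four respondents on three items (illustration only).
toy = [[1, 2, 1], [3, 3, 4], [4, 5, 4], [2, 2, 3]]
print(round(cronbach_alpha(toy), 3))
```

In a cross-cultural design, a coefficient of this kind would be computed separately within each language or national sample and then compared across samples.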

Another way to increase confidence in the cross-cultural comparability of personality 
measures is to show that the mean levels of different assessment instruments intended to 
measure the same construct, or approximately the same construct, are highly correlated 
across multiple languages or cultures. For example, if two conceptually similar personality 
trait scales are used in a large number of different cultures, a positive association between the 
mean levels of those trait scales across the broad set of cultures would provide evidence that 
both measures are tapping the same underlying construct (Campbell & Fiske, 1959; 
Cronbach & Meehl, 1955; Messick, 1995). Of course, to analyze the comparability of per- 
sonality measures using this cross-cultural convergent validation strategy, large numbers of 
cultures must be studied using conceptually similar measures of personality. 
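
The convergent validation strategy described above amounts to two steps: aggregate individual scores to one mean per culture, then correlate the culture-level means of the two instruments. The following is a minimal sketch under stated assumptions: the helper names `culture_means` and `pearson_r` are hypothetical, and the scores are invented placeholders, not data from any study.

```python
from math import sqrt


def culture_means(scores_by_culture):
    """Collapse individual scores to one mean per culture."""
    return {c: sum(v) / len(v) for c, v in scores_by_culture.items()}


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented individual trait scores on two instruments in four cultures.
instrument_1 = {"A": [3.1, 3.5], "B": [2.8, 3.0], "C": [3.9, 3.7], "D": [3.2, 3.4]}
instrument_2 = {"A": [50, 54], "B": [47, 49], "C": [58, 55], "D": [52, 50]}

m1, m2 = culture_means(instrument_1), culture_means(instrument_2)
shared = sorted(set(m1) & set(m2))         # cultures measured by both
r = pearson_r([m1[c] for c in shared], [m2[c] for c in shared])
print(f"culture-level r across {len(shared)} cultures = {r:.2f}")
```

Because each culture contributes a single data point, the effective sample size is the number of shared cultures, which is why large numbers of cultures are needed.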

Until recently, this potentially powerful method of cross-cultural or cross-language mea- 
surement validation was rarely employed, mainly because few worldwide personality data 
sets have been available for statistically meaningful comparisons to be made. Most large- 
scale studies of psychological attributes have been primarily interested in social attitudes and 
values, not in stable and enduring personality dispositions. A few items included in these 
worldwide studies may have some relevance for the measurement of personality traits. For 
example, several items of Hofstede's (2001) study of work-related values are interpretable as 
indicators of dispositions toward anxiety or neuroticism (Hofstede & McCrae, 2004). The 
World Values Survey, covering 65 countries representing more than 75% of the world’s population, 
contains items (e.g., “Most people can be trusted”) that are somewhat similar to 
those by which personality psychologists usually measure agreeable tendencies toward other 
people (Inglehart & Baker, 2000). Still, studies using full personality trait scales in large 
numbers of languages and cultures have been heretofore quite rare. 


PREVIOUS LARGE-SCALE STUDIES OF PERSONALITY TRAITS ACROSS CULTURES 


One of the first comprehensive personality trait measures to enjoy worldwide popularity 
and a fairly large number of translations into different languages is Eysenck's Personality 
Questionnaire (EPQ; Eysenck & Eysenck, 1975). In 1984, mean-level trait scores from 25 
countries were made available (Barrett & Eysenck, 1984). Ten years later, the number of coun- 
tries in which three broad personality traits—neuroticism, extraversion, and psychoticism— 
were measured by the EPQ was expanded to 37 (Lynn & Martin, 1995). In both published 
reports, the internal reliability and factorial structure of the EPQ across languages and cul- 
tures appeared psychometrically sound. However, because no other large personality data 
sets were available for comparison, it remained unclear as to whether mean-level differences 
in EPQ scores across cultures converged with other similar measures. Again, such cross- 
cultural construct validity evidence would have made it more likely that national differences 
in personality as measured by the three broad trait scales of the EPQ were because of real 
cultural disparities and not some other biasing factors. 

During the past few decades, many personality psychologists, especially those influ- 
enced by the lexical approach to person description (De Raad, 2000; Digman, 1990; 
Goldberg, 1982), have come to view personality traits in terms of five comprehensive 
dimensions, popularly known as the “Big Five” of human personality (see Goldberg, 1990; 
John, 1990). The idea that five dimensions can provide a useful framework for describing 
higher-order differences between individuals has, according to many (see McCrae & 
Costa, 1999; Wiggins & Trapnell, 1997), reached something of a consensus among per- 
sonality trait psychologists (cf. Peabody & De Raad, 2002). The Big Five dimensions of 
personality include two traits very similar to traits from the EPQ: Extraversion (sometimes 
called surgency), which is the degree to which one is active, assertive, talkative, and 
so forth (see Ashton, Lee, & Paunonen, 2002; Lucas, Diener, Grob, Suh, & Shao, 2000; 
Watson & Clark, 1997), and Neuroticism (vs. Emotional Stability), which is the degree to 
which one is anxious, depressed, irritable, and so forth (see Costa & Widiger, 1994). The 
Big Five framework also includes three additional descriptive dimensions: Agreeableness 
(whether one is generous, gentle, kind, etc.; see Graziano & Eisenberg, 1997), Conscien- 
tiousness (whether one is dutiful, organized, reliable, etc.; see Hogan & Ones, 1997), and 
Openness to Experience or Culture/Intellect (whether one is creative, imaginative, intro- 
spective, etc.; McCrae & Costa, 1997). 

In addition to being considered as merely descriptive dimensions, the Big Five traits 
have also been viewed as causal dispositions within a framework called the five-factor 
model (FFM) of personality (Costa & McCrae, 1992). The FFM conceptualizes each of the 
major dimensions of personality in a slightly different manner than does the Big Five, with 
each of the five broad dimensions composed of six specific facets or subtraits of personal- 
ity. Despite some differences between the Big Five and the FFM, both perspectives con- 
tain trait dimensions that are conceptually very similar to the EPQ traits, providing a 
unique opportunity for the cross-cultural validation of personality trait concepts. 

The most comprehensive instrument thus far designed to measure the Big Five or FFM 
is the Revised NEO Personality Inventory (NEO-PI-R; Costa & McCrae, 1992). Recently, 
the NEO-PI-R was translated into many different languages and administered to samples 
from more than two dozen countries. In 2001, NEO-PI-R data from 26 countries or cul- 
tural regions became available for the research community (McCrae, 2001), and the data- 
base was soon expanded by 10 additional cultures covering five major language families: 
Indo-European, Uralic, Altaic, Dravidian, and Sino-Tibetan (McCrae, 2002). In every culture 
and language that has been studied, the trait scales of the NEO-PI-R have displayed 
adequate levels of internal reliability, and the factorial structure of the NEO-PI-R has been 
considered robust (McCrae, 2001, 2002). 

Direct comparisons of the NEO-PI-R to the EPQ have suggested that translations of 
both instruments provide reasonably comparable estimates of mean levels of Extraversion 
and Neuroticism across cultures. For example, the mean-level scores of extraversion as 
measured by the NEO-PI-R and the EPQ were significantly correlated across 18 nations, 
r(16) = .51, p < .05 (McCrae, 2002). Thus, if a nation scored relatively high on the EPQ 
Extraversion scale, it was likely to score high on the NEO-PI-R Extraversion scale as well. 
These empirical findings, though limited to 18 cultural regions, can be taken as supportive 
evidence that the Big Five dimension of Extraversion can be comparably measured across 
human languages and cultures (see also Goldberg, 1990; Lucas et al., 2000), and they provide 
an indication that the NEO-PI-R may be useful for contrasting and comparing cultural 
levels of extraversion (though see Poortinga, Van de Vijver, & Van Hemert, 2002). 
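
The reported statistic, r(16) = .51, p < .05, can be checked by hand with the standard conversion of a correlation to a t statistic, t = r * sqrt(df / (1 - r^2)), where df = 18 nations - 2 = 16. The sketch below is only an arithmetic check; the function name `t_from_r` is hypothetical, and the critical value is the standard two-tailed tabled value for 16 degrees of freedom.

```python
from math import sqrt


def t_from_r(r, df):
    """Convert a Pearson correlation to its t statistic with df = n - 2."""
    return r * sqrt(df / (1 - r ** 2))


r, df = 0.51, 16                 # values reported in the text above
T_CRIT_05_DF16 = 2.120           # two-tailed .05 critical t for df = 16 (t table)

t = t_from_r(r, df)
print(f"t({df}) = {t:.2f}; exceeds critical value: {abs(t) > T_CRIT_05_DF16}")
```

The resulting t of about 2.37 exceeds 2.120, which agrees with the reported p < .05.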


THE PATTERNED DISTRIBUTION OF PERSONALITY TRAITS 
ACROSS NATIONS AND WORLD REGIONS 


Another indicator that mean levels of personality trait scores are comparable is that the dif- 
ferences across cultures demonstrate a systematic pattern of distribution. Although the trans- 
lation quality of the NEO-PI-R varied considerably (McCrae, 2001) and some of the studied 
cultures were represented by very small (fewer than 100) and convenient (e.g., only college 
students) samples, the NEO-PI-R data set provided strong and reliable evidence that the mean- 
level trait scores for different cultures produced meaningful patterns. Namely, the mean-level 
personality trait scores were predictably related to other culture-level indicators, such as 
Hofstede’s dimensions of culture (Hofstede & McCrae, 2004; McCrae, 2001), and both 
instruments, the EPQ and the NEO-PI-R, demonstrated that personality traits are systemati- 
cally related to external socioeconomic variables such as economic prosperity (Lynn & 
Martin, 1995; McCrae, 2001, 2002). 

In addition, the distribution of personality traits in geographic space seemed to have 
regular, systematic patterns. Neighboring countries tended to have, as a rule, similar per- 
sonality means, and regions separated geographically or historically had less similar 
means on personality trait scales (Allik & McCrae, 2004). Costa, Terracciano, and McCrae 
(2001) reported that gender differences in personality traits demonstrated a geographically 
ordered pattern, with the smallest gender differences evident among Asian and African cul- 
tures and the largest gender differences found in Europe. In addition to mean differences, 
standard deviations revealed a similar geographic pattern: Asian and African cultures were 
characterized by a relatively smaller variability than were European and American cul- 
tures, where heterogeneity of personality traits was the largest (McCrae, 2002). All these 
observations support the position that comparing mean levels of personality traits across 
cultures can be a legitimate enterprise and further suggest that mean levels of personality 
traits may prove a useful tool in understanding the important links between culture and 
psychology (Church & Lonner, 1998; Levine, 2001; Saucier & Goldberg, 2001). 


RATIONALE FOR THE CURRENT INVESTIGATION 


Although the NEO-PI-R is perhaps the most elaborate and widely used instrument for 
measuring the personality traits related to the Big Five, it is only one of a growing family 
of instruments intended to measure the five broadest dimensions of personality. Another, 
briefer, measure of these five dimensions is the BFI (Benet-Martinez & John, 1998; John 
& Srivastava, 1999). Recently, this 44-item self-report inventory was included as part of 
the International Sexuality Description Project (ISDP). The ISDP was initiated and coor- 
dinated by the first author and included convenience samples of around 200 participants 
from 56 nations (see Schmitt et al., 2002). In addition to the main ISDP focus on sexual- 
ity description, the ISDP included the BFI as a measure of personality description. 
Consequently, to our knowledge the ISDP represents the largest cross-cultural data set of 
personality trait scores thus far accumulated. 

Based on the BFI responses from the ISDP, mean levels of personality traits were made 
available from 56 nations, 27 of which overlap with the NEO-PI-R’s smaller set of cul- 
tures. This reasonably large overlapping set of cultures provided a unique opportunity to 
apply the multimethod-multitrait research strategy (Campbell & Fiske, 1959) to the study 
of personality at the level of intercultural analysis where each culture is treated as a single 
subject (for discussion of levels of analysis, see McCrae, 2000). Comparing the BFI to the 
NEO-PI-R would also, for the first time, allow researchers to examine the self-reported 
Big Five dimensions of agreeableness, conscientiousness, and openness using this cross- 
cultural and cross-instrument construct validity technique. 

Overall, there were three main goals to this investigation. Our first goal was to evaluate 
the conceptual equivalence of the BFI across cultures by examining the scale reliability 
and factor structure of the BFI across the 56 nations of the ISDP. This analysis may help 
to determine whether the relatively brief BFI may be of special use in future cross-cultural 
research endeavors. Our second objective was to compare the results of the BFI trait scores 
with scores from two other large cross-cultural personality databases, those in which the 
NEO-PI-R (McCrae, 2002) and the EPQ (Lynn & Martin, 1995) were used to profile 
national personalities and other personality-related attributes that have been assessed 
across cultures. As part of this goal, we hoped to provide a reasonable degree of confidence 
in the veridicality and metric equivalence of culture-level scores provided by the BFI. Our 
third goal was to document the worldwide distribution of personality traits as measured by 
the BFI. Because of the large number of diverse cultures in the ISDP, our plan was to extensively 
document significant deviations in personality traits across the major geographic 
regions of the globe. 


METHOD 


SAMPLES 


The research reported in this article is a result of the ISDP, a collaborative effort of more 
than 100 social, behavioral, and biological scientists from 56 nations (Schmitt et al., 2002). 
As seen in Table 1, these 56 nations were grouped into 10 geographic world regions. The 
world region of North America included 4,047 individuals sampled from three nations. The 
nation of Canada was represented by three independent, English-speaking samples from 
the Canadian provinces of Ontario, Alberta, and British Columbia and by a French-speaking 
sample from the province of Quebec. The latter sample was administered the ISDP survey 
as translated and back-translated into French. The translation and back-translation proce- 
dures will be addressed later. All Canadian samples were college students who volunteered 
for the study. Thirteen independent samples were obtained from the United States (n = 
2,793). This included at least one sample from the states of New York, Illinois, Kentucky, 
South Carolina, Florida, Alabama, Texas, New Mexico, Idaho, California, and Hawaii. In the 
sample from Hawaii, 75% of individuals described themselves as Asian American or Native 
Hawaiian. The samples from mainland United States consisted of 66% European American 
(non-Hispanic), 10% African American, 8% Hispanic American, 5% Asian American, 2% 
Native American, and 9% Other or nondescriptive. The North American world region also 
included one sample from Mexico. The Mexican sample was composed of general commu- 
nity members who volunteered for the study. 

Five cultures from the South American region were included in the ISDP (n = 1,042). This 
included samples from Peru, Bolivia, Chile, Argentina, and Brazil. As seen in Table 1, all of 
these samples were composed of college students. All volunteered for the study. The Chilean 
cultural region included two independent samples; one was not administered surveys con- 
taining explicit sexual questions. All South American samples were administered the ISDP 
survey as translated and back-translated into Spanish, except for the Brazilian sample who 
completed the survey as translated and back-translated into Portuguese. 

Eight cultural regions from Western Europe were represented in the ISDP (n = 2,975). 
This included one sample each from Finland, the Netherlands, Belgium (Flanders region), 
France, and Switzerland (German-speaking region). Multiple samples were collected from 
the United Kingdom, Germany, and Austria. The samples from the United Kingdom, 
Germany, and Austria included both college students and general community members. 
Eleven cultural regions from Eastern Europe were represented in the ISDP (n = 2,795). 
This included one sample each from Estonia, Latvia, Lithuania, Poland, the Czech 
Republic, Slovakia, Ukraine, Romania, Serbia (Yugoslavia), Croatia, and Slovenia. All 
Eastern European samples were administered the ISDP survey in their native languages. 

TABLE 1 

Sample Sizes, Sampling Type, and Language of Survey Across the 56 Nations of the 
International Sexuality Description Project 

                                     Sample Size
Cultural Region                      Men     Women   Sample Type         Language
North America
  Canada                             373     666     College students    English/French
  Mexico                             106     109     Community based     Spanish
  United States of America           999     1,794   College students    English
South America
  Argentina                          110     136     College students    Spanish
  Bolivia                            92      89      College students    Spanish
  Brazil                             42      55      College students    Portuguese
  Chile                              100     212     College students    Spanish
  Peru                               106     100     College students    Spanish
Western Europe
  Austria                            207     260     College/community   German
  Belgium (Flanders)                 166     356     College students    Dutch (Flemish)
  Finland                            32      90      Community based     Finnish
  France                             62      74      College students    French
  Germany                            294     496     College/community   German
  Netherlands                        115     126     College students    Dutch
  Switzerland                        85      129     College students    German
  United Kingdom                     138     345     College/community   English
Eastern Europe
  Croatia                            113     109     College students    Croatian
  Czech Republic                     106     129     College students    Czech
  Estonia                            79      109     College students    Estonian
  Latvia                             90      103     College students    Latvian
  Lithuania                          47      47      College students    Lithuanian
  Poland                             309     537     College students    Polish
  Romania                            123     128     College students    Romanian
  Serbia                             100     100     College students    Serbian
  Slovakia                           84      100     College students    Slovak
  Slovenia                           73      109     College students    Slovenian
  Ukraine                            100     100     College/community   Ukrainian
Southern Europe
  Cyprus                             24      36      College students    Greek
  Greece                             47      182     College students    Greek
  Italy                              92      108     College/community   Italian
  Malta                              133     198     College students    English
  Portugal                           110     142     College students    Portuguese
  Spain                              95      178     College students    Spanish
Middle East
  Israel                             180     214     College students    Hebrew
  Jordan                             80      195     College students    Arabic
  Lebanon                            124     139     College students    English
  Turkey                             206     206     College/community   Turkish
Africa
  Botswana                           97      116     College students    English
  Congo, Democratic Republic of the  126     66      College/community   French
  Ethiopia                           140     100     College/community   English
  Morocco                            93      89      College students    English
  South Africa                       81      81      College students    English
  Tanzania, United Republic of       93      43      College students    English
  Zimbabwe                           100     100     College students    English
Oceania
  Australia                          201     288     College/community   English
  Fiji and Pacific Islands           81      82      College/community   English
  New Zealand                        116     158     College students    English
South and Southeast Asia
  Bangladesh                         83      62      College students    Bangla
  India                              100     100     College students    English
  Indonesia                          55      56      College students    Indonesian
  Malaysia                           50      91      College students    Malay
  Philippines                        121     161     College students    English
East Asia
  Hong Kong (China)                  100     101     College students    English
  Japan                              157     102     College students    Japanese
  Korea, Republic of                 195     295     College students    Korean
  Taiwan                             116     93      College students    Mandarin

NOTE: Most samples were primarily composed of college students; some included general members of the com- 
munity. All samples were convenience samples. Further details on sampling methods within each culture are 
available from the authors. 
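
As a consistency check, the per-nation counts in Table 1 can be summed and compared with the regional sample sizes given in the Samples text and with the overall N of 17,837 stated in the abstract. The sketch below transcribes the table by hand; the names `TABLE_1`, `REPORTED_N`, and `region_total` are hypothetical, and the Slovenian male count is read as 73, which is an assumption about a garbled entry (it is the value that reproduces the reported Eastern European n of 2,795).

```python
# (men, women) counts per nation, transcribed from Table 1.
# Assumption: Slovenia's male count is read as 73.
TABLE_1 = {
    "North America": {
        "Canada": (373, 666), "Mexico": (106, 109),
        "United States of America": (999, 1794),
    },
    "South America": {
        "Argentina": (110, 136), "Bolivia": (92, 89), "Brazil": (42, 55),
        "Chile": (100, 212), "Peru": (106, 100),
    },
    "Western Europe": {
        "Austria": (207, 260), "Belgium (Flanders)": (166, 356),
        "Finland": (32, 90), "France": (62, 74), "Germany": (294, 496),
        "Netherlands": (115, 126), "Switzerland": (85, 129),
        "United Kingdom": (138, 345),
    },
    "Eastern Europe": {
        "Croatia": (113, 109), "Czech Republic": (106, 129),
        "Estonia": (79, 109), "Latvia": (90, 103), "Lithuania": (47, 47),
        "Poland": (309, 537), "Romania": (123, 128), "Serbia": (100, 100),
        "Slovakia": (84, 100), "Slovenia": (73, 109), "Ukraine": (100, 100),
    },
    "Southern Europe": {
        "Cyprus": (24, 36), "Greece": (47, 182), "Italy": (92, 108),
        "Malta": (133, 198), "Portugal": (110, 142), "Spain": (95, 178),
    },
    "Middle East": {
        "Israel": (180, 214), "Jordan": (80, 195), "Lebanon": (124, 139),
        "Turkey": (206, 206),
    },
    "Africa": {
        "Botswana": (97, 116), "Congo, Democratic Republic of the": (126, 66),
        "Ethiopia": (140, 100), "Morocco": (93, 89), "South Africa": (81, 81),
        "Tanzania, United Republic of": (93, 43), "Zimbabwe": (100, 100),
    },
    "Oceania": {
        "Australia": (201, 288), "Fiji and Pacific Islands": (81, 82),
        "New Zealand": (116, 158),
    },
    "South and Southeast Asia": {
        "Bangladesh": (83, 62), "India": (100, 100), "Indonesia": (55, 56),
        "Malaysia": (50, 91), "Philippines": (121, 161),
    },
    "East Asia": {
        "Hong Kong (China)": (100, 101), "Japan": (157, 102),
        "Korea, Republic of": (195, 295), "Taiwan": (116, 93),
    },
}

# Regional sample sizes as reported in the Samples text.
REPORTED_N = {
    "North America": 4047, "South America": 1042, "Western Europe": 2975,
    "Eastern Europe": 2795, "Southern Europe": 1345, "Middle East": 1344,
    "Africa": 1325, "Oceania": 926, "South and Southeast Asia": 879,
    "East Asia": 1159,
}


def region_total(region):
    """Sum men and women over every nation in one world region."""
    return sum(men + women for men, women in TABLE_1[region].values())


for region, reported in REPORTED_N.items():
    status = "ok" if region_total(region) == reported else "MISMATCH"
    print(f"{region:<26}{region_total(region):>7}{reported:>7}  {status}")

grand_total = sum(region_total(r) for r in TABLE_1)
n_nations = sum(len(nations) for nations in TABLE_1.values())
print(f"total N = {grand_total} across {n_nations} nations")
```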


The ISDP had six cultural regions to represent Southern Europe (n = 1,345), including 
Portugal, Spain, Italy, Malta, Greece, and Cyprus. The Malta region included two samples 
of college students. It is important to acknowledge that the placement of cultures into these 
three European regions may be viewed by some as problematic and certainly that more 
than three basic regions exist in Europe, including northern, central, and other divisions. 
However, given the number and geography of nations included in the ISDP, we chose these 
three divisions to economize our presentation while maintaining genuine regional varia- 
tion across the European continent (see also Schmitt et al., 2002). 

Four cultures from the Middle East world region were included in the ISDP (n = 1,344). 
This included two samples from Turkey, one composed of college students and the other 
of general community members. The placement of Turkey in the Middle East region may 
be viewed as problematic in that Turkey could have been placed into several possible categories, 
including southeastern Europe, the Mediterranean, or southwestern Asia. However, 
for comparative purposes using our present geographic groupings, we chose to place 
Turkey in the Middle East world region. One sample from Lebanon was included; these 
were college students who volunteered for the study. Two samples from Israel were 
included; both were composed of college students. One sample from Jordan was included; 
these were volunteer college students who did not receive the full ISDP survey. 

Seven cultural regions from Africa were included in the ISDP (n = 1,325). This included 
college students from Morocco, the United Republic of Tanzania, Zimbabwe, Botswana, 
and South Africa. A sample of both college students and community members was accu- 
mulated from Ethiopia. All of these samples were administered the ISDP survey in 
English, and the Moroccan and Ethiopian samples' surveys contained annotated explana- 
tions for some of the most difficult words and phrases as identified in pretesting sessions. 
A seventh African sample containing both college students and community members was 
accumulated from the Democratic Republic of the Congo. This sample was administered 
the ISDP survey in French. 

Three cultural regions from Oceania were included in the ISDP (n = 926). This included 
two samples from Australia (one from eastern Australia containing college students and 
one from western Australia that included both college students and community members), 
one sample from New Zealand, and one sample from Fiji. The sample from Fiji was col- 
lected at the University of the South Pacific, a true regional university. Although a large 
number of participants were from Fiji, a significant number came from surrounding 
nations within the Pacific Island region. Consequently, we will refer to this cultural region 
as the Fiji and Pacific Islands region. 

Five cultures from South or Southeast Asia were included in the ISDP (n = 879). This 
included one sample each from India, Bangladesh, Malaysia, Indonesia, and the 
Philippines. Four cultural regions from East Asia were included (n = 1,159), one sample 
each from Hong Kong (now a part of the People’s Republic of China), Taiwan (Republic 
of China), and Japan, and two samples were accumulated from the Republic of (South) 
Korea. For statistical purposes, the cultures of Taiwan and Hong Kong (China) were kept 
separate when conducting nation-level analyses. 

Overall, this collection of cultural regions represented a diverse array of ethnic, geo- 
graphic, and linguistic categories. In total, the many cultures of the ISDP represent 6 con- 
tinents, 13 islands, 29 languages, and 56 nations. Most samples were composed of college 
students (indicated in Table 1 under the sample type column by “college students” or 
“college”); some included general members of the community (indicated by “community 
sample” or “community”). All samples were convenience samples. Most samples were 
recruited as volunteers, some received course credit for participation, and others received 
a small monetary reward for their participation. All samples were administered an anony- 
mous self-report survey. Most surveys were returned via sealed envelope or the usage of a 
drop-box. Return rates for college student samples tended to be relatively high (around 
95%), though this number was lower in some cultures. Return rates for community 
samples were around 50%. 

Not all participants received the full ISDP survey in samples from Chile, Jordan, South 
Africa, Fiji, India, and Bangladesh, though all samples received the BFI measure used in 
this article. Missing data was a problem in some samples, though this was generally 
restricted to measures that dealt explicitly with sexual desire and infidelity—topics not 
addressed in this article. For the BFI, if an individual item was not completed, this resulted 
in the full trait scale being treated as missing data. Further details on the sampling and 
assessment procedures within each of the cultural regions are provided elsewhere (Schmitt 
et al., 2002) and are available from the authors. 
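
The missing-data rule described above (one unanswered item makes the whole trait scale missing) can be sketched as follows. This is an illustration under stated assumptions, not the authors' scoring code: the function name `scale_score` is hypothetical, missing answers are represented as `None`, the scale score is taken to be the mean of the items, and reverse-keying of items is assumed to have been handled beforehand.

```python
def scale_score(item_responses):
    """Score one trait scale for one participant.

    Returns None (scale treated as missing) if any item is unanswered;
    otherwise returns the mean of the item ratings (an assumed scoring rule).
    """
    if any(r is None for r in item_responses):
        return None
    return sum(item_responses) / len(item_responses)


print(scale_score([4, 5, 3]))      # complete responses: scored
print(scale_score([4, None, 3]))   # one item skipped: whole scale missing
```

A listwise rule of this kind avoids imputing item responses, at the cost of dropping a participant from any scale on which a single item was skipped.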


PROCEDURE 


All collaborators were asked to administer an anonymous, 9-page survey to at least 100 
men and 100 women. As seen in Table 1, most national samples reached this approximate 
sample size of men and women. Some nations, such as the United States and Canada, con- 
tained numerous convenience samples, and so the national sample size was much larger 
than 200. All participants were provided with a brief description of the study, including the 
following written instructions: 


This questionnaire is entirely voluntary. All your responses will be kept confidential and your 
personal identity will remain anonymous. No identifying information is requested on this sur- 
vey, nor will any such information be added later to this survey. If any of the questions make 
you uncomfortable, feel free not to answer them. You are free to withdraw from this study at 
any time for any reason. This series of questionnaires should take about 20 minutes to com- 
plete. Thank you for your participation. 


The full instructional set provided by each collaborator varied, however, and was adapted 
to fit the specific culture and type of sample. Details on incentives and cover stories used 
across samples are available from the authors. 


MEASURES 


Translation procedures. Researchers from nations where English was not the primary 
language were asked to use a translation and back-translation process and administer the 
ISDP in their native language. This procedure typically involved the primary collaborator 
translating the measures into the native language of the participants and then having a sec- 
ond person back-translate the measures into English. Differences between the original 
English and the back-translation were discussed, and mutual agreements were made as to 
the most appropriate translation. This etic procedure tries to balance the competing needs 
of making the translation meaningful and naturally readable to the native participants 
while preserving the integrity of the original measure and its constructs (Brislin, 1980; 
Church, 2001). As seen in Table 1, this process resulted in the survey being translated into 
28 different languages. Samples from Morocco, Ethiopia, Fiji, the Philippines, and Hong 
Kong were administered the survey in English, but certain terms and phrases were anno- 
tated to clarify what were thought to be confusing words for the participants. The transla- 
tion of the ISDP survey into the Flemish dialect of Dutch used only a translation 
procedure, as this involved mainly word variant changes from the original Dutch. In addi- 
tion, pilot studies were conducted in several testing sites, in part to clarify translation and 
comprehension concerns. 


Demographic measure. Each sample was first presented with a demographic measure 
entitled Confidential Personal Information. This measure included questions about gender, 
age, date of birth, weight, height, sexual orientation, current relationship status, socioeco- 
nomic status as a child, socioeconomic status now, area raised (rural, urban, suburban), 
total number of years of education, current religious affiliation, degree of religiosity, eth- 
nic background, and political attitude (conservative vs. liberal). Not all of these questions 
were included in all samples (e.g., date of birth was considered too invasive in some cul- 
tures; some cultures had no concept of suburban), and all collaborators were asked to adapt 
the demographic questions to obtain the most appropriate demographic variables for their 
culture (e.g., ethnicity and religious affiliation categories varied across cultures; political 
attitude terminology varied across cultures). 


Personality trait measure. All samples were administered the BFI of personality traits 
(Benet-Martínez & John, 1998). The 44-item English BFI was constructed to allow quick 
and efficient assessment of five personality dimensions—Neuroticism, Extraversion, 
Openness, Agreeableness, and Conscientiousness—when there is no possibility or need 
for more differentiated measurement of personality facets (Benet-Martínez & John, 1998). 
Self-report ratings are made on a scale from 1 (disagree strongly) to 5 (agree strongly) for 
each of the 44 items. This self-report measure was chosen to be part of the ISDP because 
of its ease of administration, because of its brevity, and because it has proven useful for 
cross-language and cross-cultural research (Benet-Martínez & John, 1998). 

After all responses were collected from the 56 nations of the ISDP, however, certain 
translation errors became apparent in the BFI. For example, Item 14 from the BFI asks participants 
to rate whether they see themselves as someone who “can be tense.” This item 
was mistranslated into German as leicht entspannt reagiert, which means “reacting in a 
somewhat relaxed manner.” The word entspannt should have been rendered as 
gespannt, which means “tense.” As a result, responses to BFI Item 14 in the Austrian, 
German, and Swiss samples were reversed before conducting further analyses. A few items 
in other translations required similar reversals, including items from the Ukrainian, 
Malaysian, and South Korean samples. 


Other measures of the ISDP. Participants in the ISDP were asked to complete several 
additional measures, some of which were used in the present analyses. For example, the 
ISDP included a measure of global self-esteem (Rosenberg, 1965) and a measure called the 
sociosexual orientation inventory (Simpson & Gangestad, 1991). Also included were mea- 
sures of adult romantic attachment (Bartholomew & Horowitz, 1991) and multiple tools to 
capture variation in human sexuality, including measures of sociosexuality (Simpson & 
Gangestad, 1991), short-term mating tendencies (Schmitt, Shackelford, Duntley, Tooke, & 
Buss, 2001), a survey of human mate poaching experiences (Schmitt & Buss, 2001), and the 
Sexy Seven trait measure of sexual self-description (Schmitt & Buss, 2000). 


RESULTS 


Not all participants fully completed all measures used in the present study. Consequently, 
we used the following procedure for dealing with missing data. First, any participant who did 
not complete at least 40 of the 44 items from the BFI was eliminated from further analyses. 
This resulted in 429 participants, evenly dispersed across world regions, being removed from 
consideration. The resulting sample of 17,408 participants formed the basis of the remaining 
analyses. Second, when computing scale scores, if a participant was missing more than one 
item from a Big Five scale, the scale was treated as missing data for that participant. This 
caused degrees of freedom to vary across some analyses. 
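
The two-step missing-data rule described above can be sketched in Python as follows. This is a minimal illustration, not the authors' scoring script: the function names, the use of `None` for an unanswered item, the reverse-keying formula 6 - x for a 1-to-5 rating, and the choice to score a scale as the mean of its answered items are all assumptions made for the example.

```python
from typing import AbstractSet, Optional, Sequence


def is_retained(responses: Sequence[Optional[int]], min_answered: int = 40) -> bool:
    """Step 1: keep a participant only if enough of the 44 BFI items were answered."""
    answered = sum(1 for r in responses if r is not None)
    return answered >= min_answered


def scale_score(
    responses: Sequence[Optional[int]],
    items: Sequence[int],
    reversed_items: AbstractSet[int] = frozenset(),
    max_missing: int = 1,
) -> Optional[float]:
    """Step 2: score one scale, or return None if too many of its items are blank."""
    values = []
    for i in items:
        r = responses[i]
        if r is None:
            continue
        # Assumed reverse-keying for a 1-5 rating scale (1<->5, 2<->4).
        values.append(6 - r if i in reversed_items else r)
    n_missing = len(items) - len(values)
    if n_missing > max_missing or not values:
        return None  # scale treated as missing data for this participant
    # Assumption: the scale score is the mean of the answered items.
    return sum(values) / len(values)
```

Under this sketch, a participant can pass the 40-item screen and still have one scale set to missing, which is why degrees of freedom can differ from one analysis to the next.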


INTERNAL RELIABILITY AND FACTOR STRUCTURE OF 
THE BFI ACROSS 56 NATIONS AND 10 WORLD REGIONS 


The internal reliabilities of the BFI scales (using Cronbach's itemized alpha coefficient) 
across all cultures were .77, .70, .78, .79, and .76 for Extraversion, Agreeableness, 
Conscientiousness, Neuroticism, and Openness, respectively. The internal reliabilities of the 
BFI scales within each of the 10 ISDP world regions are listed in Table 2. These reliabilities 
were based on all pooled responses across nations within each region. Reliabilities were sub- 
stantial within most regions, though reliabilities did fall below .60 for Extraversion and 
Openness in Africa and for Agreeableness in South and Southeast Asia. Still, these preliminary 
results indicated that the BFI appeared internally reliable across world regions. 
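
For readers who wish to compute reliabilities of this kind, the standard formula for Cronbach's alpha, alpha = k/(k - 1) * (1 - sum of item variances / variance of the total score), can be sketched as below. The function name and the toy data are hypothetical, and the sketch assumes complete responses with reverse-keyed items already recoded.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a participants-by-items matrix of one scale."""
    k = len(rows[0])  # number of items in the scale
    # Sample variance of each item across participants.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy example: three participants, two items.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

Pooling all respondents within a region, as described above, corresponds to passing every row from that region's nations to a single call.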

TABLE 2 
Internal Reliability of the Big Five Inventory Scales Across the 10 World Regions of 
the International Sexuality Description Project 

World Region               Extraversion       Agreeableness      Conscientiousness  Neuroticism        Openness
North America              .84                .77                [illegible]        .82                .79
South America              .70                .67                .76                .74                .79
Western Europe             .84                .68                .82                .82                .79
Eastern Europe             .71                .65                .72                [illegible]        .74
Southern Europe            .74                .67                [illegible]        .79                .76
Middle East                [illegible]        .67                [illegible]        .76                [illegible]
Africa                     .55                .62                .68                .63                .58
Oceania                    .82                .76                .79                .82                .72
South and Southeast Asia   .64                .57                .71                .77                .68
East Asia                  .72                .64                .73                .75                .78

When the raw responses of 17,408 individuals to the 44 items of the BFI were factored 
using principal axis factoring with varimax rotation, a clear five-factor structure was 
recoverable (Cattell, 1966). A six-factor structure was also discernible. However, the sixth 
factor was extremely weak and simply consisted of the negative loading items of 
Agreeableness. In total, the first five factors explained 30.8% of the variance. An alternate 
approach is to standardize all scores within culture before conducting factor analyses. This 
procedure reduces the confound of individual and cultural differences (Bond, 2001). When 
this was done, however, very similar results were obtained. 

The worldwide factor structure of the BFI was very similar to the structure of the U.S. 
sample. To compare these two structures, the worldwide varimax matrix (excluding the 
United States) was Procrustes rotated to the U.S. structure (see Table 3). The choice of the 
U.S. structure as a target for Procrustes rotation was based on the fact that the BFI was 
developed in the United States and serves as the standard for the BFI. Even so, from a for- 
mal point of view, no one alignment of axes is preferable to others and any other structure 
could be selected as a reference for comparison. The total congruence coefficient was .99 
and all factor congruence coefficients exceeded .97, indicating the virtual identity of these 
two factor structures. Individual-item congruences (McCrae, Zonderman, Costa, Bond, & 
Paunonen, 1996) also demonstrated good agreement: All coefficients were equal to or 
higher than .92. Thus, the personality structure recovered from the U.S. sample was almost 
identical to the dominant BFI personality structure that can be recovered from a diverse 
sample representing 55 other nations from around the world. 
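
The congruence coefficients used in this comparison can be illustrated with a short sketch. It assumes the coefficient is Tucker's congruence coefficient (the sum of cross-products of two loading vectors divided by the square root of the product of their sums of squares) and that the comparison matrix has already been Procrustes rotated toward the target; the rotation itself is not shown. The function names are hypothetical, and computing the total congruence over all entries of the two matrices is an assumption of this example.

```python
from math import sqrt
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]  # rows = items, columns = factors


def congruence(x: Sequence[float], y: Sequence[float]) -> float:
    """Tucker's congruence coefficient between two loading vectors."""
    num = sum(a * b for a, b in zip(x, y))
    den = sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


def factor_congruences(target: Matrix, rotated: Matrix) -> List[float]:
    """One coefficient per factor (column-wise comparison)."""
    n_factors = len(target[0])
    return [
        congruence([row[j] for row in target], [row[j] for row in rotated])
        for j in range(n_factors)
    ]


def item_congruences(target: Matrix, rotated: Matrix) -> List[float]:
    """One coefficient per item (row-wise comparison)."""
    return [congruence(t, r) for t, r in zip(target, rotated)]


def total_congruence(target: Matrix, rotated: Matrix) -> float:
    """Single coefficient over all loadings (assumed definition)."""
    flat_t = [v for row in target for v in row]
    flat_r = [v for row in rotated for v in row]
    return congruence(flat_t, flat_r)
```

Because the coefficient is insensitive to uniform rescaling of a loading vector, two structures with proportional loadings yield a congruence of 1 even if the absolute sizes of the loadings differ.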

Next, to study factorial similarity across the major areas of the world in more detail, we 
computed factor structures for 10 geographical regions as grouped in Table 1. Table 4 
reports the congruence coefficients for the Big Five factors after the varimax structure was 
Procrustes rotated toward the U.S. structure, regarded here as the BFI standard. With a mean 
congruence coefficient of .94 across all factors and geographical regions, there was a 
considerable degree of congruence among personality structures. Even the lowest value (.84) in 
Table 4 provided a clearly better-than-chance replication of the BFI factor structure. This 
agreement was particularly noteworthy in that the unit of analysis was single items, not 
their aggregates, as is typical for most cross-cultural comparisons. 

TABLE 3 
Factor Loadings for the Big Five Inventory (BFI) Across 56 Nations and 28 
Languages After Procrustes Rotation Targeted to the U.S. Normative Structure 

BFI Items                                       E            A            C            N            O           CONG
Is talkative                                    .62          .04         -.04          .11          .08         1.00
Is outgoing, sociable                           .62          .24          .08         -.04          .14          .99
Generates a lot of enthusiasm                   .46          .20          .13         -.05          .32          .98
Is full of energy                               .44          .20          .20         -.14          .26          .99
Has an assertive personality                    .33          .03          .23         -.15          .26          .96
Tends to be quiet                              -.68          .10          .07          .04          .07          .99
Is shy, inhibited                              -.53          .08         -.08          .26          .09          .99
Is reserved                                    -.50          .03          .09          .12          .07          .99
Is considerate and kind to almost everyone      .00          .59          .15          .05          .13          .99
Has a forgiving nature                          .03          .51         -.03         -.05          .13          .99
Is helpful and unselfish with others            .04          .46          .16          .02          .16          .99
Likes to cooperate with others                  .18         [illegible]   .12          .03          .14          .99
Is generally trusting                           .09          .42          .04          .00          .09          .98
Is sometimes rude to others                     .11          .41         -.12          .23          .10          .98
Starts quarrels with others                     .18          .38         -.09          .22          .04          .99
Can be cold and aloof                          -.21          .37         -.04          .12         [illegible]  [illegible]
Tends to find fault in others                   .06          .32         -.03          .27          .08          .98
Does a thorough job                             .04          .08          .59          .05          .13         1.00
Does things efficiently                         .11          .21          .57         -.05          .20          .99
Perseveres until the task is finished           .02          .08          .53          .01          .17         1.00
Is a reliable worker                            .04          .20          .52          .08          .12          .99
Makes plans, follows through with them          .11          .07          .51         -.03          .14          .99
Tends to be lazy                               -.10          .10         -.54          .17          .09         1.00
Tends to be disorganized                        .00          .03         -.53          .12          .14         1.00
Can be somewhat careless                        .04          .04         -.46          .11          .17          .98
Is easily distracted                            .01          .01         -.39          .32          .08          .98
Worries a lot                                  -.12          .03         -.03          .63          .03          .99
Gets nervous easily                             .09          .05          .07          .58         -.02          .96
Can be tense                                   -.08          .06          .04          .58          .06          .99
Can be moody                                    .04          .19          .06          .48          .11          .98
Is depressed, blue                              .28          .14          .14          .46          .03          .99
Is relaxed, handles stress well                 .05          .13          .02         -.57          .19          .99
Is emotionally stable, not easily upset         .01          .17          .11         -.49          .14         1.00
Remains calm in tense situations               -.02          .13          .15         -.46          .26          .99
Is inventive                                    .18          .05          .12         -.18          .58          .98
Has an active imagination                       .13          .04         -.03          .05          .56         1.00
Is original, has new ideas                     [illegible]   .02          .13         -.14          .55          .99
Likes to reflect, play with ideas               .02          .04          .07          .04          .53          .99
Values artistic, aesthetic experiences          .00          .09          .01          .08          .52          .99
Is ingenious, deep thinker                      .02          .01          .19          .00          .47          .99
Is sophisticated in art, music, or literature   .01          .01         -.02         -.01          .46          .99
Is curious about many different things          .18          .10          .05         -.01          .42          .99
Has few artistic interests                     -.02          .01          .05          .02         -.34          .98
Prefers work that is routine                   -.10          .06          .07          .07         -.21          .96
Factor Congruence                               .99          .99          .99          .99          .98          .99

NOTE: E = Extraversion; A = Agreeableness; C = Conscientiousness; N = Neuroticism; O = Openness; CONG = 
item congruence. Loadings higher than absolute .30 are shown in bold. 


TABLE 4 
Congruence Coefficients Between Big Five Inventory Factor Structure 
From 10 World Regions and the U.S. Structure 

World Region                E            A            C            N            O           Total
North America(a)            .99          .98          .98          .98          .99          .98
South America               .95          .94          .96          .96          .97          .95
Western Europe              .98          .98          .98          .96          .97          .97
Eastern Europe              .90          .87          .96         [illegible]   .96          .93
Southern Europe             .94          .94          .98          .98          .97          .96
Middle East                 .94          .88          .96          .97          .95          .94
Africa                      .88          .93          .88          .90          .84          .88
Oceania                     .99          .98          .98          .98          .97          .98
South and Southeast Asia    .91          .85          .89          .95          .86          .88
East Asia                   .94          .93          .95          .94          .95          .94
Average                     .94          .93          .96          .96          .94          .94

NOTE: E = Extraversion; A = Agreeableness; C = Conscientiousness; N = Neuroticism; O = Openness. 
a. U.S. data are excluded. 


Except for Africa and South and Southeast Asia, the factor structures of world regions 
showed total congruences that exceeded .90, a value above which factor structures are regarded 
as clearly replicable (Haven & ten Berge, 1977). However, it should be noted that these 
two outlying regional structures do not form a single non-Western personality type, in part 
because the factor congruence between Africa and South and Southeast Asia was also not 
very high (.90). In some cases, the reason for poor agreement was a single isolated item 
that may have been poorly translated or not commonly understood, but not always. For 
example, after eliminating from African data the BFI item “has few artistic interests” (this 
reversed item of the BFI Openness scale had a congruence coefficient as low as .19), the 
average congruence coefficient across all scales (.88), and for Openness in particular (.93), 
did not increase substantially. This and another inverted BFI Openness item (i.e., “prefers 
work that is routine”) functioned similarly in both South and Southeast Asian (.65) and 
African (.53) samples. 

The aberrant behavior of an isolated item may not be the only cause of these slight dis- 
crepancies. In some other cases, for example Conscientiousness in South and Southeast Asia 
and Africa, the primary loadings on the appropriate factor were high enough, but there were 
loadings of BFI items from other scales that were incongruous with the target structure. 
Nevertheless, the generalizability of the factor structure across cultures was sufficient to pro- 
ceed to the next step, evaluating the convergent validity of culture-level scores. 


THE COMPUTATION OF STANDARDIZED PERSONALITY TRAIT SCORES FOR 56 NATIONS 


To maximize the comparability of personality profiles across the 56 nations of the ISDP, raw 
Big Five scale scores for each nation were converted to standardized T-scores (see Table 5). 
T-scores were considered preferable because they are relatively easy to interpret, always having 
an overall mean of 50 and a standard deviation of 10. In this case, T-scores were computed 
by first standardizing the raw national scores around the U.S. average for each of the Big Five. 
That is, the U.S. score was subtracted from each national score, and this result was divided by 
the U.S. standard deviation. Afterward, these U.S.-standardized national scores were multiplied 
by 10, and then 50 was added. Similarly, national standard deviations were converted by dividing 
each deviation by the U.S. standard deviation and multiplying the result by 10. 
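
The conversion just described amounts to two one-line formulas, sketched here in Python. The function names and the numeric values in the usage lines are hypothetical; rescaling national standard deviations by the U.S. standard deviation (rather than by the U.S. mean) is assumed, because that is what keeps the U.S. reference values at a mean of 50 and a standard deviation of 10.

```python
def to_t_score(national_mean: float, us_mean: float, us_sd: float) -> float:
    """Express a national scale mean as a T-score relative to the U.S. sample."""
    z = (national_mean - us_mean) / us_sd  # U.S.-standardized score
    return 10 * z + 50


def to_t_sd(national_sd: float, us_sd: float) -> float:
    """Express a national standard deviation in T-score units (assumed: divide by U.S. SD)."""
    return 10 * national_sd / us_sd


# Hypothetical raw values: a nation half a U.S. standard deviation above the U.S. mean.
t = to_t_score(national_mean=3.5, us_mean=3.25, us_sd=0.5)   # 55.0
sd = to_t_sd(national_sd=0.25, us_sd=0.5)                    # 5.0
```

By construction, the United States itself always receives a T-score of 50 with a standard deviation of 10 on every scale.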

Although this procedure may appear unnecessarily ethnocentric, the United States was 
used to standardize scores for two compelling reasons. First, previous studies involving the 
Big Five have standardized scores using the United States as a reference point (e.g., 
McCrae, 2002). Consequently, this procedure maximized the comparability of our ISDP 
findings with research previously reported in the literature. Second, if the worldwide average 
from the ISDP were used to standardize Big Five scores, the particular set of nations in the 
ISDP would influence the resulting T-scores, making them difficult to compare with scores 
from future studies involving a different set of nations. In short, using the United States 
as a standard provided the most reliable means for conducting cross-cultural comparisons. 

One avenue for evaluating whether the nation-level T-scores of the BFI represent generalizable 
constructs is to look at the correlations of mean male T-scores for each nation 
and mean female T-scores for each nation. If men’s and women’s T-scores for each of the 
Big Five were correlated across cultures, this would speak to the generalizability of the 
nation-level scores across gender groups. We found that men’s Extraversion levels were 
significantly correlated with women’s Extraversion levels across the 56 nations of the 
ISDP, r(54) = .52, p < .001. Even stronger evidence of generalizability was found for levels 
of Agreeableness, r(54) = .82, p < .001, Conscientiousness, r(54) = .80, p < .001, 
Neuroticism, r(54) = .66, p < .001, and Openness, r(54) = .69, p < .001. Thus, it was apparent 
that whatever the BFI is measuring across cultures, it does so with regularity across the 
genders of each culture. 


CORRELATIONS AMONG PERSONALITY TRAIT SCALES FROM THE BFI AND THE EPQ 


Lynn and Martin (1995) published the mean scores of Neuroticism, Extraversion, and 
Psychoticism as measured by the EPQ for 37 countries. Of these countries, 24 overlapped 
with the set of 56 nations included in the ISDP. EPQ scores for 2 additional countries 
(Poland and Zimbabwe) reported by van Hemert, van de Vijver, Poortinga, and Georgas 
(2002) increased the overlapping set of nations to 26. Although the EPQ and the BFI do 
not conceptualize Neuroticism and Extraversion in a completely identical way, they do 
share common core themes (see Costa & Widiger, 1994; Watson & Clark, 1997). As 
expected, the Neuroticism scales of the two instruments were significantly correlated, 
r(24) = .49, p = .01. The correlation between the BFI Extraversion scale and its EPQ 
counterpart was disappointingly low and did not reach statistical significance, r(24) = .18. For 
comparison, the correlations of the NEO-PI-R Neuroticism and Extraversion domains 
with their EPQ counterparts in the set of 18 overlapping cultures described earlier were 
both significant, .80 and .51, respectively (McCrae, 2002). One possible reason for the lower 
association of the BFI and EPQ scales compared to the NEO-PI-R and EPQ scales is the 
greater diversity of cultures included in the ISDP. It may be that individuals from some 
non-Western cultures, of which there are more in the ISDP, respond to extraversion scales 
differently than do individuals from most Western cultures. Still, this poor correlation is 
cause for concern with regard to the validity of the BFI Extraversion scale. 


CORRELATIONS AMONG PERSONALITY TRAIT SCALES 
FROM THE BFI AND THE NEO-PI-R 


Table 6 shows correlations between the BFI and the NEO-PI-R domains in an overlap- 
ping set of 27 cultures (i.e., Austria, Belgium, Canada, Croatia, Czech Republic, Estonia, 
France, Germany, Hong Kong, India, Indonesia, Italy, Japan, Malaysia, the Netherlands, 
Peru, Philippines, Portugal, South Africa, South Korea, Spain, Switzerland, Taiwan, 
Turkey, United States, Yugoslavia, and Zimbabwe). Among the 36 cultures studied by McCrae 
(2002), two cultures (South Africa and India) were represented by two separate samples. 
For comparisons with the BFI, the Black and White NEO-PI-R samples of South Africa 
were averaged, and India was represented by the Marathi NEO-PI-R sample as the other 
Indian sample was composed of adolescents (see McCrae, 2002). 

TABLE 6 
Correlations Between the Big Five Inventory (BFI) Factor Scales and the Revised 
NEO Personality Inventory (NEO-PI-R) Domain and Facet Scales 

                            BFI
NEO-PI-R                    E            A            C            N            O
Extraversion (E)            .43*         .32          .46*        -.24          .73***
Agreeableness (A)          -.33          .22         -.07         -.10         -.33
Conscientiousness (C)       .30          .58**        .45*        -.61***       .21
Neuroticism (N)            -.30         -.41**       -.23          .45*        -.23
Openness (O)                .20         -.09         -.12         -.07          .27
E1: Warmth                  .43*         .39*         .43*        -.35          .42*
E2: Gregariousness          .38          .14          .34         -.11          .69***
E3: Assertiveness           .36          .30          .29         -.27          .54**
E4: Activity                .44*         .28          .47**       -.38*        [illegible]
E5: Excitement seeking      .24          .34          .39*        -.06          .58**
E6: Positive emotions       .49**        .36         -.17         -.37         [illegible]
A1: Trust                  [illegible]   .24         -.18         -.07         -.17
A2: Straightforwardness    -.26          .10         -.05          .06         -.31
A3: Altruism                .35          .39*        [illegible]  -.35          .54**
A4: Compliance             -.35          .12         -.13          .01         [illegible]
A5: Modesty                -.15          .24          .14         -.19          .25
A6: Tender mindedness      -.15          .27          .14         -.30          .09
C1: Competence              .43*        [illegible]  [illegible]  -.38*        [illegible]
C2: Order                   .18          .61          .33         -.51**        .05
C3: Dutifulness             .11          .40*         .22         -.57***       .28
C4: Achievement striving    .43*         .37          .41*        -.50**        .28
C5: Self-discipline         .26          .55***       .38*        -.50**        .34
C6: Deliberation            .02          .27          .11         -.60***      -.21
N1: Anxiety                -.45*        -.44*        -.24          .63**       -.27
N2: Angry hostility         .00         -.25          .03          .24          .00
N3: Depression             -.60***      -.41*        -.30          .61***      -.38*
N4: Self-consciousness     -.36         -.25         -.32          .28         -.52**
N5: Impulsiveness           .13         -.26         -.05          .25          .31
N6: Vulnerability          -.39*        -.72***      -.53**       [illegible]  -.56**
O1: Fantasy                 .19         -.29         -.10          .13          .36
O2: Aesthetics              .23         -.04         -.09         -.08          .19
O3: Feeling                 .32         -.03         -.05         -.11          .44*
O4: Action                  .21          .21          .16         -.24          .21
O5: Ideas                   .23          .29          .14         -.33          .28
O6: Values                  .02         -.19         -.11          .02          .27

NOTE: N = 27. The cross-cultural convergence correlations are shown in bold. 
*p < .05. **p < .01. ***p < .001. 

The results reported in Table 6 demonstrate cross-instrument correlations across all five 
high-order domains of personality and all 30 NEO-PI-R personality facets. Previous 
reports have demonstrated convergent validity for Neuroticism and Extraversion personality 
scales at the intercultural level of analysis (McCrae, 2002). The convergent validity of 
BFI nation-level scores on Extraversion, r(25) = .44, p < .05, and Neuroticism, r(25) = .45, 
p < .05, was confirmed in this study as well. The Extraversion correlation is particularly 
reassuring, given the poor correlation between the BFI and EPQ Extraversion scales. 
Importantly, this is the first attempt to assess culture-level convergence for 
the self-reported traits of Agreeableness, Conscientiousness, and Openness. The strongest 
convergent correlation for these traits was found between the NEO-PI-R and BFI measures 
of Conscientiousness, r(25) = .45, p < .05. For both Agreeableness (.22) and 
Openness (.27), the corresponding convergent correlations were positive but failed to reach 
the level of statistical significance. 

The low convergence between BFI Agreeableness and the NEO-PI-R Agreeableness did 
not persist across all the NEO-PI-R facets of Agreeableness, however. For example, the 
NEO-PI-R facet of Altruism (A3) correlated significantly with the BFI Agreeableness 
scale, r(25) = .39, p < .05. Indeed, when a select composite was formed among the four 
NEO-PI-R facets of Trust, Altruism, Modesty, and Tender Mindedness—four facets that 
are more at the conceptual core of Big Five Agreeableness (Graziano & Eisenberg, 
1997)—national levels of BFI Agreeableness correlated significantly with the NEO-PI-R, 
r(25) = .47, p < .01. Still, the discriminant validity correlations for BFI Agreeableness were 
poor when looking at NEO-PI-R Conscientiousness and Neuroticism. In both cases, BFI 
Agreeableness was related to several facets of these other traits. 

Similar to our Agreeableness findings, the relatively low level of convergent validity for 
the BFI Openness scale was not robust across all the NEO-PI-R facets of Openness. For 
example, the NEO-PI-R facet of Feeling (O3) correlated significantly with the BFI 
Openness scale, r(25) = .44, p < .05. When a select composite was formed among the four 
NEO-PI-R facets of Fantasy, Feeling, Ideas, and Values, national levels of BFI Openness 
correlated significantly with the NEO-PI-R, r(25) = .41, p < .05. Still, the discriminant 
validity correlations for BFI Openness were very poor, especially when comparing BFI 
Openness to NEO-PI-R Extraversion. These shortcomings in discriminant validity will be 
addressed later. 

Additional cross-cultural validity evidence can be gleaned by comparing the standard 
deviations across cultures. An overall standard deviation index was computed as the average 
across all five BFI scales. When this BFI variability index was compared to a similar index 
derived from the NEO-PI-R results, the two indexes of personality variability were marginally 
correlated across cultures, r(25) = .36, p = .06. However, after the elimination of 
Estonia's SD as a probable outlier (the largest SD value in the whole NEO-PI-R set; McCrae, 
2002), the correlation between these two sets became significant, r(24) = .43, p < .05. 


LIMITATIONS AND PROBLEMS WITH ESTABLISHING 
CROSS-INSTRUMENT OR CROSS-CULTURAL VALIDITY 


Although significant, these cross-cultural convergent correlations between the BFI and the 
NEO-PI-R domain scales are noticeably smaller than cross-instrument convergence at the 
individual level (i.e., when the same individuals simultaneously complete both measures). 
At the individual level, even the smallest convergent correlations typically exceed the .60 
level with the BFI scales (Benet-Martínez & John, 1998). Apparently, biases and measurement 
errors prevented the convergent correlations between the two measurement instruments 
from being more substantial at the intercultural level. There are at least four obvious 
candidate explanations for the lower than expected cross-instrument convergent correlations 
at the intercultural level. 


Sampling. Some of the national samples, those studied using the BFI and those studied 
using the NEO-PI-R, were represented by relatively small samples and were certainly not 
representative of entire national populations. In the BFI study, for example, France was 
represented by one of the smallest samples, only 136 college students. This sampling limitation 
may be one of the reasons why France curiously received one of the lowest scores (T = 45.44) 
on the BFI Extraversion scale. According to previous studies on personality and social 
psychological attitudes, the French population has demonstrated no inclination toward 
extreme introversion. Instead, the French mean score is typically located close to the middle 
of the Extraversion distribution (McCrae, 2002). For example, in the NEO-PI-R sample 
France was the 15th from the top among 36 cultures (McCrae, 2002). Also, the French EPQ 
score of Extraversion was quite average, rather close to the midpoint of the distribution (Lynn & 
Martin, 1995). It is also indicative of sampling problems with the French ISDP sample that 
after eliminating French data the correlation between the BFI and NEO-PI-R Extraversion 
scales increased considerably, from r(25) = .44, p < .05, to r(24) = .52, p < .01. Thus, some 
of the studied samples may be problematic, and their results may diminish the observed 
interinstrumental correlations. 


Standardization. Typically, findings from the NEO-PI-R are normalized with respect to age 
and gender. Across cultures, women generally score higher than men on the NEO-PI-R 
scales of Neuroticism and Agreeableness, and college-age men and women tend to score 
higher on the NEO-PI-R scales of Openness, Neuroticism, and Extraversion, and lower on 
Conscientiousness, than do older individuals (McCrae, 2002). To correct for age and gender 
differences, raw NEO-PI-R scores are standardized with respect to U.S. age and gender 
norms. In contrast, the BFI trait scores from the ISDP were not standardized for age and 
gender. Fortunately, the whole ISDP sample was relatively well balanced with regard to sex 
and generally homogeneous with regard to age. In addition, analyses using sex- and 
age-normalized BFI scores produced results similar to those reported here. 

Nevertheless, some observed cross-cultural differences might have been caused, at least 
in part, by differences in the sample mean age. For example, among three German-speaking 
cultures, Switzerland scored higher in openness (T = 52.62) than did Austria (T = 
49.29) and particularly higher than did Germany (T = 47.80). An extensive study of 
German-speaking countries (n = 7,974) has previously shown that the mean-level differences 
between these three countries are normally very small; only 1.1% of the variance in 
openness was explained by the country of participants (Angleitner & Ostendorf, 2000). In 
contrast to the sample from Switzerland, both German and Austrian samples from the 
ISDP contained noncollege samples of adults who typically score lower in openness. The 
mean age of Germans was 27.9 years, whereas Austrians were 26.5 and Swiss Germans 
were only 23.6. Although the mean differences in age were not very large, they may be 
partly responsible for the observed intercultural differences in openness. 


Acquiescence. It is possible that in some cultures people have a stronger tendency to 
agree with test items regardless of their content—a response bias known as the acquies- 
cence bias. The NEO-PI-R and all its translations minimize the effects of the acquiescence 
bias because all subscales contain roughly equal numbers of positively and negatively 
phrased statements. The BFI, in contrast, may be affected by the acquiescence bias 
because the number of direct and reversed items is not balanced. For example, of the 10 
BFI items designed to capture variation in openness, only 2 are keyed in the opposite direc- 
tion, and only 3 of 8 BFI Neuroticism items are keyed in the opposite direction. Therefore, 
we can expect that after partialling out the acquiescence bias (i.e., by constructing an 
acquiescence index where an equal number of positively and negatively keyed items from 
each of the BFI scales are scored in the same direction), the correlation between the BFI 
and the NEO-PI-R corresponding scales would improve. Indeed, after controlling for 
acquiescence, the partial correlations increased slightly; in the case of BFI and NEO-PI-R 
Openness scales, the association rose from r(25) = .27, ns, to r(24) = .40, p < .05. 
Agreeableness cross-instrument correlations were also affected by partialling out the 
acquiescence from the BFI, shifting from r(25) = .22 to r(24) = .27. Thus, the acquiescence 
bias was likely one of the causes for the lowered convergent correlations reported earlier 
between parallel instruments across 27 nations. 
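
The logic of controlling for acquiescence can be sketched as follows. This is an illustration under stated assumptions, not the analysis actually run: the function names are hypothetical, the acquiescence index is taken to be the mean raw (unreversed) rating over a balanced set of positively and negatively keyed items, and a first-order partial correlation is used because the text does not specify whether a partial or a semipartial correlation was computed.

```python
from math import sqrt
from typing import Sequence


def acquiescence_index(
    responses: Sequence[float],
    positive_items: Sequence[int],
    negative_items: Sequence[int],
) -> float:
    """Mean raw rating over equally many positively and negatively keyed items.

    Because the item content cancels out across the balanced pairs, a high
    value mainly reflects a tendency to agree regardless of content.
    """
    if len(positive_items) != len(negative_items):
        raise ValueError("the item set must be balanced")
    picked = list(positive_items) + list(negative_items)
    return sum(responses[i] for i in picked) / len(picked)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x: Sequence[float], y: Sequence[float], z: Sequence[float]) -> float:
    """Correlation of x and y after removing z from both (first-order partial)."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
```

At the nation level, `x` and `y` would hold the national BFI and NEO-PI-R scale means and `z` the national mean of the acquiescence index; each control variable costs one degree of freedom, which matches the drop from r(25) to r(24) reported above.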


Conceptualization. Although the NEO-PI-R and the BFI can be viewed as measuring 
the same broad Big Five personality traits, the way in which they conceptualize each trait 
is slightly different. Almost by definition, the NEO-PI-R has been designed to measure a 
wider array of concepts than the BFI. Empirical data from the ISDP would seem to support 
this view. For example, the definition of neuroticism in the BFI seems to be primarily 
related to Anxiety (N1), Depression (N3), and Vulnerability (N6) because the BFI had sig- 
nificant correlations only with these NEO-PI-R facets (see Table 6). This result was hardly 
surprising because among the eight BFI Neuroticism items there are none that ostensibly 
measure the NEO-PI-R facets of Angry Hostility, Self-Consciousness, or Impulsiveness— 
scales that some Big Five theorists tend to place in other domains (see John, 1990; Wiggins 
& Trapnell, 1997). 

In addition to conceptual breadth, in some cases the NEO-PI-R and the BFI seem to 
focus on different aspects or manifestations of the same underlying traits. For example, the 
finding that BFI Agreeableness was related to the NEO-PI-R Extraversion facet of warmth 
(E1), r(25) = .39, p < .05, may reflect the fact that the NEO-PI-R includes in Extraversion 
features of prosociality and interpersonal closeness that the BFI tends to place within the 
trait of Agreeableness (see also Graziano & Eisenberg, 1997). Differences in conceptual- 
ization are particularly obvious in the case of the BFI Openness scale. Judged on the basis 
of ISDP intercultural correlations, the way in which Openness is defined in the BFI is 
much closer to the NEO-PI-R definition of Extraversion than NEO-PI-R Openness. The 
correlation between BFI Openness and NEO-PI-R Extraversion was extremely high, r(25) 
= .73, p < .001, and correlations of BFI Openness with facets of NEO-PI-R Extraversion 
ranged from .42 to .71, all of which were significant (see Table 6). Interestingly, the same 
tendency was noticeable at the individual level of analysis—where individuals from the 
same culture are administered both tests at the same time. For example, the English BFI 
Openness scale is rather strongly correlated, r(160) = .44, p < .001, with NEO-FFI 
Extraversion (Benet-Martínez, personal communication, March 2002). 

These examples seem to suggest that both instruments are measuring basically the same 
spectrum of personality traits, but their categorization of this spectrum is slightly different 
(Poortinga et al., 2002). To test this hypothesis, we performed canonical analysis between 
the five BFI and the five NEO-PI-R domain scales. We found that the canonical correlation 
was remarkably high, R = .91, χ²(25) = 54.28, p = .001. Thus, even at the intercultural 
level of analysis, these two instruments were highly redundant. The redundancy of the first 
(BFI) set of measures given the second (NEO-PI-R) set was 57.8%, and the redundancy of 
the second (NEO-PI-R) set of measures, given the first (BFI) set, was 48.0%. Because suc- 
cessively extracted canonical roots were uncorrelated, to arrive at a single index of redun- 
dancy one can simply sum up the redundancies across all significant roots (Stewart & 
Love, 1968). The first canonical root alone accounts for 66.6% of the redundancy. When 
all significant roots were taken into account, virtually all information about the culture 
level of personality traits provided by one instrument (95.5%) can be recovered on the 
basis of information that was measured by another instrument. 
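
The redundancy logic described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `canonical_redundancy`, the simulated data, and the use of NumPy's QR and SVD routines are assumptions made purely for illustration. Following the approach attributed to Stewart and Love (1968), the redundancy of one set given the other is taken, for each canonical root, as the mean squared loading of the set's variables on its own canonical variate multiplied by the squared canonical correlation; the per-root values are then summed.

```python
import numpy as np

def canonical_redundancy(X, Y):
    """Canonical correlations and per-root redundancy of Y given X.

    X and Y are (cases x variables) arrays; in a culture-level analysis
    each row would be one nation (hypothetical layout, not the ISDP files).
    """
    n = X.shape[0]
    # Standardize every column so that loadings are plain correlations.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    # Orthonormal bases for the two column spaces.
    Qx, _ = np.linalg.qr(Xz)
    Qy, _ = np.linalg.qr(Yz)
    # Singular values of Qx'Qy are the canonical correlations.
    _, s, Vt = np.linalg.svd(Qx.T @ Qy)
    k = len(s)
    # Canonical variates of the Y set (unit length, mutually uncorrelated).
    Vy = Qy @ Vt.T[:, :k]
    # Loadings: correlation of each Y variable with each Y variate.
    loadings = (Yz.T @ Vy) / np.sqrt(n - 1)
    variance_extracted = (loadings ** 2).mean(axis=0)
    redundancy = variance_extracted * s ** 2
    return s, redundancy

# Simulated example (not ISDP data): 60 "cultures", two five-scale instruments.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
Y = X @ rng.normal(size=(5, 5)) + rng.normal(size=(60, 5))
canon_r, red = canonical_redundancy(X, Y)
total_redundancy = red.sum()
print(np.round(canon_r, 2), round(float(total_redundancy), 3))
```

When both sets contain the same number of scales, the redundancy summed over all roots equals the average squared multiple correlation of each scale in one set with all scales in the other, which is why it can be read as the share of information recoverable from the other instrument.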


ESTABLISHING CROSS-INSTRUMENT OR CROSS-CULTURAL VALIDITY 
BY RELATING BIG FIVE SCORES TO EXTERNAL CRITERIA 


A final avenue for evaluating the cross-instrument validity of national Big Five scores is 
to correlate nation-level scores from different Big Five measures with select external criteria. 
For example, if the national extraversion profiles provided by the EPQ, the NEO-PI-R, and 
the BFI similarly predict nation-level scores of an external variable, such as sexual behavior, 
this would provide evidence of the functional equivalence of the extraversion scales. In addi- 
tion, if intraregional correlations revealed similar patterns to international findings—if extra- 
version were reliably related to sexuality within nations in the same way that extraversion 
relates to sexuality across nations—this would provide evidence that the nation-level scores 
along the Big Five are capturing variability in personality traits that is meaningful at a more 
psychological level (see also Steel & Ones, 2002). Using additional data from the ISDP, we 
were able to provide two cases for evaluating the functional equivalence of international and 
intraregional Big Five scores. 

Participants from 47 nations of the ISDP completed the Sociosexual Orientation 
Inventory (SOI), a measure of sexual behaviors, emotions, and attitudes (Simpson & 
Gangestad, 1991). Higher scores on the SOI indicate that a person has a more liberal or 
"promiscuous" orientation to sexuality. As is typical for measures of liberal sexual attitudes, 
the sociosexual variation has been shown to be positively related to extraversion-related 
traits (Simpson & Gangestad, 1991; Snyder, Simpson, & Gangestad, 1986). As shown in the 
first data column of Table 7, national levels of extraversion were positively correlated with 
national levels of sociosexuality across the EPQ, NEO-PI-R, and BFI measures of extra- 
version. This finding provides cross-instrument and cross-cultural evidence of the func- 
tional equivalence (i.e., concurrent validity) of national extraversion profiles. Even though 
extraversion is conceived of in a slightly different manner across these three instruments, 
national sociosexuality levels were positively correlated with extraversion across all three 
measures. In addition, extraversion was positively associated with sociosexuality (control- 
ling for sex of participant) within all the world regions of the ISDP, though in Africa this 
association only approached marginal significance, r(797) = .05, p = .12. Sex of participant 
was controlled for because of the tendency for men to score much higher than women on 
the SOI (Simpson & Gangestad, 1991). Across all individual nations, the mean within- 
nation correlation between extraversion and sociosexuality was .11, with a standard devia- 
tion of .11. These findings suggest that the correlations based on nation-level profiles may 
reflect a psychological phenomenon that also takes place within most nations. As seen down 
the second data column of Table 7, international and intraregional levels of neuroticism 
tended to be unrelated to sociosexuality. This provided some evidence of the discriminant 
validity of nation-level personality profiles. However, similar to the earlier findings, the dis- 
criminant validity of the BFI Neuroticism scale was somewhat poor in that it did correlate 
with sociosexuality. 
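
The intraregional coefficients are partial correlations with sex of participant held constant. As a minimal sketch (not the authors' code), a first-order partial correlation can be computed from the three zero-order correlations; the function name and the input values below are hypothetical and are not ISDP results.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical zero-order correlations (illustrative only):
# x = extraversion, y = sociosexuality, z = sex of participant.
r_partial = partial_correlation(r_xy=0.15, r_xz=0.05, r_yz=0.30)
print(round(r_partial, 3))
```

Because men tend to score higher than women on the SOI, removing the variance shared with sex keeps a sex difference in either variable from inflating or masking the extraversion-sociosexuality association.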

Participants from 55 nations of the ISDP completed the Rosenberg Self-Esteem Scale 
(RSES), a measure of global self-esteem. Higher scores on the RSES indicate that a person 
has a higher level of self-esteem. Typically, research from Western cultures has shown that 
individuals with high self-esteem tend to be more extraverted and less neurotic than those 

TABLE 7 
International and Intraregional Correlations Among Extraversion, Neuroticism, 
and Key External Criteria 

                                  Sociosexuality               Self-Esteem
                            Extraversion  Neuroticism   Extraversion  Neuroticism
Eysenck’s Personality Questionnaire
  International                .60*          −.34          .40           −.27
  Number of nations            14            14            15            15
NEO-PI-R
  International                .72***        −.17          [illegible]   [illegible]
  Number of nations            23            23            26            26
Big Five Inventory
  International                .56***        −.30*         .30*          −.18
  Number of nations            47            47            55            55
Intraregional (a)
  North America                .12***        .02           [illegible]   [illegible]
  South America                .11**         −.06          [illegible]   [illegible]
  Western Europe               [illegible]   −.03          [illegible]   −.51***
  Eastern Europe               .19***        −.05*         [illegible]   [illegible]
  Southern Europe              [illegible]   .01           [illegible]   [illegible]
  Middle East                  .15**         −.03          [illegible]   [illegible]
  Africa                       .05           .00           [illegible]   [illegible]
  Oceania                      .12*          −.01          [illegible]   [illegible]
  South and Southeast Asia     .20*          −.05          [illegible]   [illegible]
  East Asia                    .14***        .00           [illegible]   [illegible]

a. Partial correlations controlling for sex of participant. 
*p < .05. **p < .01. ***p < .001. 


with low self-esteem (McCrae & Costa, 1990). As shown down the right side of Table 7, 
national levels of extraversion tended to be positively associated with national levels of 
self-esteem. For the EPQ, this association approached marginal significance (p = .14). This 
finding provided cross-instrument and cross-cultural evidence of the functional equiva- 
lence validity of national extraversion profiles. National levels of neuroticism were only 
slightly related to national levels of self-esteem, though this association reached marginal 
significance for NEO-PI-R Neuroticism (p = .07). In addition, extraversion was positively 
associated, whereas neuroticism was negatively associated, with self-esteem within all the 
world regions of the ISDP. Across all individual nations, the mean within-nation correla- 
tion between extraversion and self-esteem was .35, with a standard deviation of .12. Again, 
these findings confirm that the correlations based on nation-level profiles reflect real psy- 
chological phenomena that take place within nations. These findings, taken together, sug- 
gest that national levels of personality traits as assessed by various measures reasonably 
converge in their ability to predict external criteria. 


PATTERNS OF THE BIG FIVE ACROSS 56 NATIONS AND 10 WORLD REGIONS 


The third major objective of this study was to identify any patterns in personality traits 
across the worldwide sample of the ISDP. Looking across all 56 nations of the ISDP, we 
found a statistically significant main effect of nation on BFI Extraversion, F(55, 17,333) = 
9.96, p < .001, η² = .03, though the magnitude of this effect as indexed by partial eta-square 
(η²) was only small to moderate in size. According to Cohen (1988), η² is considered small 
at .01, medium at .06, and large at .14 or above. As shown in Table 5, it appeared that the most 
extraverted people tended to live in Serbia and Croatia, whereas the most introverted 
resided in Bangladesh and France. Post hoc analyses (e.g., Tukey's honestly significant 
difference, HSD) confirmed these general national trends. 

We found a significant and moderately sized main effect of nation on Agreeableness, 
F(55, 17,346) = 29.36, p < .001, η² = .09. The most agreeable nations were the Democratic 
Republic of the Congo and Jordan, whereas Japan and Lithuania scored the lowest on 
Agreeableness. Nation had a moderate main effect on the BFI Conscientiousness factor 
scores, F(55, 17,334) = 30.90, p < .001, η² = .09. The top nations in Conscientiousness 
were the Democratic Republic of the Congo and Ethiopia, whereas Japan and South Korea 
scored the lowest. A small to moderate main effect of nation was observed on the trait of 
Neuroticism, F(55, 17,338) = 17.03, p < .001, η² = .05. Table 5 shows that the highest 
national scores on the BFI Neuroticism scale were from Japan and Argentina, whereas the 
lowest national levels of Neuroticism were obtained from the Democratic Republic of the 
Congo and Slovenia. Respondents from Chile and Belgium rated themselves as the most 
open to experience, whereas the people of Japan and Hong Kong described themselves as 
extremely low in Openness. The main effect of nation on Openness was also statistically 
significant and moderate in size, F(55, 17,239) = 23.94, p < .001, η² = .07. 
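
The effect sizes reported for the main effect of nation can be cross-checked from the F ratios and their degrees of freedom. The sketch below is illustrative rather than the authors' procedure: the function names are hypothetical, the "negligible" label for values below .01 is an added convenience, and the conversion assumes a one-way design, in which partial eta-squared equals F × df(effect) / (F × df(effect) + df(error)).

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta-squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def cohen_label(eta_sq):
    """Rough verbal label using Cohen's (1988) benchmarks of .01, .06, and .14."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"  # label added for illustration only

# F ratios and degrees of freedom as reported in the text for the nation effect.
reported = {
    "Extraversion": (9.96, 55, 17333),
    "Agreeableness": (29.36, 55, 17346),
    "Conscientiousness": (30.90, 55, 17334),
    "Neuroticism": (17.03, 55, 17338),
    "Openness": (23.94, 55, 17239),
}
for trait, (f_value, df1, df2) in reported.items():
    eta_sq = partial_eta_squared(f_value, df1, df2)
    print(f"{trait}: eta^2 = {eta_sq:.2f} ({cohen_label(eta_sq)})")
```

With samples this large, even small effects are statistically significant, which is why the text relies on the effect size rather than the p value to judge magnitude.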

To determine whether certain patterns or profiles in personality exist across cultures, one 
possibility is to examine trait means not in isolation but simultaneously across the whole 
personality profile. The sum of the squared differences among the five corresponding scores for 
each pair of nations (i.e., the squared Euclidean distance) can be used to characterize the similarity 
of their personality profiles. Looking at the shortest Euclidean distances between cultures, it was clear that 
many pairs were close to each other in systematic ways. The list of the nearest neighbors 
includes, for example, such pairs as the Democratic Republic of the Congo and Tanzania, 
Botswana and South Africa, Malaysia and Fiji Islands, Germany and Austria, Greece and 
Cyprus, and Latvia and Lithuania. Many of these pairs share the same geographical region, 
history, culture, and ancestry. However, some of the closest neighbors have little in common 
that could easily explain the extreme similarity of their personality profiles. For example, it 
seemed unclear what kind of cultural or historical relatedness outside of pure coincidence 
could bring together Estonia and Mexico, Indonesia and the United Kingdom, Israel and 
Finland—all similar in the Euclidean distance across all Big Five dimensions. 
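
The profile comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the nation names and scores are invented, each profile is assumed to be a list of five standardized trait scores in a fixed order, and the function names are hypothetical.

```python
import math
from itertools import combinations

def squared_distance(profile_a, profile_b):
    """Sum of squared differences across the five corresponding trait scores."""
    return sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))

def rank_pairs(profiles):
    """Return every pair of nations sorted from most to least similar."""
    pairs = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(profiles.items(), 2):
        distance = math.sqrt(squared_distance(prof_a, prof_b))
        pairs.append((distance, name_a, name_b))
    return sorted(pairs)

# Invented profiles (E, A, C, N, O); these are not ISDP values.
profiles = {
    "Nation A": [50.0, 50.0, 50.0, 50.0, 50.0],
    "Nation B": [51.0, 50.0, 49.0, 50.0, 50.0],
    "Nation C": [60.0, 40.0, 55.0, 45.0, 52.0],
}
ranked = rank_pairs(profiles)
closest_distance, closest_a, closest_b = ranked[0]
print(closest_a, closest_b, round(closest_distance, 2))
```

Ranking pairs by the squared distance or by its square root gives the same ordering, so either can be used to identify nearest neighbors.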

Nonetheless, several systematic trends in the worldwide distribution of personality 
traits were evident, especially when looking at cultures not in isolation but aggregated over 
the entire geographical world regions listed in Table 1. According to these groupings, the 
least extraverted people tended to live in East Asia. As shown in Figure 1, using raw means 
(not T-scores) and 95% confidence interval error bars, it appeared that the level of 
Extraversion was much lower in East Asia than in most other world regions. A one-way 
ANOVA with world region as the independent variable and Extraversion as the dependent 
variable found a significant main effect of world region, F(9, 17,379) = 20.29, p < .001, 
η² = .01, though the magnitude of this effect as indexed by η² was small. Multiple post hoc 
analyses (e.g., Tukey's HSD) confirmed the significant deviation of East Asia from other 
world regions. Interestingly, South America and South and Southeast Asia were also 
significantly lower on extraversion than the rest of the world. 

World region had a significant main effect on agreeableness, F(9, 17,392) = 101.26, 
p < .001, η² = .05. As seen in Figure 2, nations from Africa scored significantly higher and 
Figure 1: Extraversion Levels (With 95% Confidence Interval [CI] Error Bars) Across the 10 World 
Regions of the International Sexuality Description Project 


East Asians scored significantly lower than all other world regions. The regions of South 
America, Western Europe, and Eastern Europe were significantly different from all other 
regions as well, according to post hoc analyses. World region had a significant and moderately 
sized main effect on Conscientiousness, F(9, 17,380) = 122.84, p < .001, η² = .06. 
As with Agreeableness, the world region of Africa scored higher and the region of East 
Asia scored significantly lower on Conscientiousness than all other world regions accord- 
ing to post hoc analyses (see Figure 3). 

World region had a statistically significant but small main effect on Neuroticism, F(9, 
17,384) = 47.45, p < .001, η² = .02. In somewhat of a contrast to the regional trends in 
Conscientiousness, Africa scored significantly lower on the BFI Neuroticism scale, 
whereas East Asia scored higher, than did all other world regions. In addition, Figure 4 
shows that South America and Southern Europe scored higher than did all regions save 
East Asia. Finally, world region had a significant and small- to moderate-sized main effect 
on Openness, F(9, 17,375) = 63.33, p < .001, η² = .03. As shown in Figure 5, the world 
region of East Asia scored significantly lower on Openness than did all other world regions 
according to all post hoc analyses. Interestingly, Africa also scored lower on openness than 
other regions, whereas South America scored significantly higher than did other world 
regions. 

Figure 2: Agreeableness Levels (With 95% Confidence Interval [CI] Error Bars) Across the 10 World 
Regions of the International Sexuality Description Project 


Some of these regional personality profiles may seem counterintuitive. In particular, 
stereotypes about national character usually do not portray people in East Asian cultures (e.g., 
Chinese, Korean, and Japanese cultures) as lacking the will and determination to work hard 
toward their goals (i.e., low conscientiousness). Our ISDP 
findings challenge not only intuitions about personality stereotypes but also certain rea- 
soned expectations about the relationships between personality traits and objective societal 
indicators. For example, it would seem logical to expect that the economic prosperity of a 
nation would be related to the conscientiousness of its citizens or at least that conscien- 
tiousness would be a favorable factor for economic development. Contrary to this expec- 
tation, the correlation between the BFI factor scores of Conscientiousness and gross 
domestic product (GDP) per capita approached marginal significance in the negative direction, 
r(52) = −.21, p = .13 (GDP per capita data were taken from the United Nations 
Development Programme, 2000). A similar result was recently found when relating the 
NEO-PI-R Conscientiousness scale to GDP per capita (Steel & Ones, 2002), r(24) = –.68, 
p < .05. These findings provide an interesting example of how the direct assessment of per- 
sonality traits can produce results that run counter to expected culture-personality rela- 
tionships. It remains unclear, though, whether these counterintuitive results should 
challenge the extension of culture-personality links at the individual level to the cultural 
Figure 3: Conscientiousness Levels (With 95% Confidence Interval [CI] Error Bars) Across the 
10 World Regions of the International Sexuality Description Project 


level or whether the national personality profiles here reflect a problem of metric equiva- 
lence. Simply analyzing mean-level differences across cultures is not the only way to 
determine whether certain regions have particular personality profiles, however. 


BIG FIVE STANDARD DEVIATIONS ACROSS 56 NATIONS AND 10 WORLD REGIONS 


Although the factor score values of the BFI are subject to many different biases (e.g., 
different cultural standards by which a trait is judged), the variability about the averages 
would seem less vulnerable to this particular kind of distortion, though it is still vulnera- 
ble to certain response biases (Au & Cheung, 2004; Chan, Gelfand, Triandis, & Tzeng, 
1996). Examination of the intercorrelations of BFI domain standard deviations reveals that 
the magnitude of variance is consistent across facets. Like regularities previously reported 
in the personality literature (McCrae, 2001), cultures that had high SDs for some domain 
of personality tended to have high SDs for all other domains or facets as well. In the current 
sample of 56 nations, intercorrelations of SDs were significant, ranging from .39 to .67 
(mean r = .50). Because variability was generalizable across domains, a mean SD 
was calculated and standardized over the five domains as described earlier (see Table 5). 

Figure 4: Neuroticism Levels (With 95% Confidence Interval [CI] Error Bars) Across the 10 World 
Regions of the International Sexuality Description Project 


The range of the aggregated variability was substantial, from Malaysia (SD = 6.62) to 
Mexico (SD = 10.49). Like the NEO-PI-R data (McCrae, 2002), most of the nations from 
Asian and African world regions, with the notable exceptions of New Zealand and 
Australia, were in the lower half of the distribution of mean SDs. The mean SD variability 
was the most conspicuous, in contrast, among European and American countries. It is pos- 
sible that in modern, industrialized societies, the heterogeneity of personality traits is 
larger than in developing nations. Indeed, the mean SD correlated positively with life 
expectancy, r(52) = .42, p < .01, and per capita GDP, r(52) = .50, p < .001. 


DISCUSSION 


This cross-cultural study of personality traits had three primary objectives. First, we 
evaluated the factor structure of the BFI across diverse forms of human culture. We found 
that the five-dimensional structure found previously (Benet-Martínez & John, 1998) was 
highly replicable across all the major cultural regions of the world. Second, we wanted to 
evaluate the validity of nation-level BFI trait profiles. We found that BFI trait levels were 
reliably related to national profiles previously reported in the literature (e.g., from the 
Figure 5: Openness Levels (With 95% Confidence Interval [CI] Error Bars) Across the 10 World 
Regions of the International Sexuality Description Project 


NEO-PI-R), particularly when issues of sampling and acquiescence are addressed. Third, 
we attempted to document the worldwide distribution of personality traits as measured by 
the BFI. We found several systematic patterns of personality traits across cultures. We now 
review each of these major objectives in turn, paying close attention to the limitations of 
our findings. 


DOES THE PERSONALITY STRUCTURE OF THE BFI GENERALIZE ACROSS CULTURES? 


After comparing the Spanish- and English-language versions of the BFI, Benet- 
Martinez and John (1998) came to the conclusion that there was little evidence for sub- 
stantial Latin-U.S. cultural differences in personality structure at the broad level of 
abstraction represented by the Big Five. The present study expanded the comparison of the 
BFI structure to another 28 languages and 54 cultures. Although the results of the present 
investigation basically agreed with the conclusions of Benet-Martinez and John—that 
observed cultural differences in personality structure are rather small—there remain some 
important caveats to this general conclusion. 

In the majority of cross-cultural comparisons, the differences in personality structure 
were very small and should probably be ignored. However, in some cases the differences 
in the patterns of BFI covariation were not totally negligible. There were some nations, and 
entire geographical regions, where the BFI personality structure deviated slightly from the 
dominant personality structure characteristic of most of the world. For example, we found in 
Asia that the BFI structure was somewhat at odds with the U.S. structure. Other researchers 
have noticed that personality traits within this general cultural region, particularly in the 
Philippines, are not always organized exactly in the same manner as is typical of Western 
countries (Guanzon-Lapeña, Church, Carlota, & Katigbak, 1998). Moreover, according to 
some reports the openness domain of personality has not been consistently extracted in China 
(F. M. Cheung et al., 2001). Because Asian cultures tend to be more collectivist (Hofstede, 
2001), a reasonable speculation may be that openness takes on a different form or function 
in more collectivist cultures. 

In previous research, similar problems have occurred when personality traits were mea- 
sured in Africa. For example, researchers failed to find a clear openness factor in Black 
South African cultures (Heaven, Connors, & Stones, 1994; Heuchert, Parker, Stumpf & 
Myburgh, 2000). In a Shona translation of the NEO-PI-R, the Openness scale demonstrated 
the poorest factorial congruence with regard to the U.S. normative structure 
(Piedmont, Bain, McCrae, & Costa, 2002). On the other hand, in one careful examination 
of the generalizability of the FFM to the Philippines—a typical example of a collectivist 
culture (Hofstede, 2001)—it was concluded that the five-factor structure replicated well in 
the Philippines and indigenous or emic inventories added only modest incremental validity 
beyond that of imported or etic instruments (Katigbak, Church, Guanzon-Lapeña, 
Carlota, & del Pilar, 2002). 

The present study may provide some special insight into the problem of whether the 
openness factor of the Big Five is replicable in non-Western nations. Indeed, the ISDP represents 
the largest sampling of African cultures ever conducted in which the Big Five were 
directly assessed and includes 7 separate nations with more than 1,200 individual respon- 
dents. Rather surprisingly, the factor in the ISDP world region of Africa that demonstrated 
the closest resemblance to the U.S. personality structure was openness (.93), the only one 
that exceeded a factor replicability criterion of .90. There may be several explanations for 
the discrepancy between the present study and previous failures to replicate the openness 
dimension in Africa. It is possible, for example, that openness is a concept that is difficult 
to translate into African languages, such as Shona and Xhosa, in which there is a shortage 
of openness-related terms. Much of the previous research has focused on the psychological 
dimensionality of indigenous or emic single-word descriptors, whereas the BFI 
uses full statements about behaviors, thoughts, and emotions instead of just individual 
terms. In addition, African participants in the ISDP were not studied in their native lan- 
guages and instead were administered either English or French versions of the BFI. 
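
The replicability values discussed above (e.g., .93 against a criterion of .90) compare a factor's loadings in one region with the corresponding U.S. loadings. As a minimal sketch, and assuming the index is a congruence coefficient of the Tucker type (an assumption for illustration, since the computation is not restated in this section), the comparison could look like this; the loadings below are invented and the function name is hypothetical.

```python
import math

def tucker_congruence(loadings_a, loadings_b):
    """Congruence coefficient between two columns of factor loadings."""
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = math.sqrt(
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    )
    return cross / norm

# Invented loadings of five items on an "openness" factor (not ISDP values).
us_openness = [0.70, 0.65, 0.60, 0.10, 0.05]
region_openness = [0.62, 0.55, 0.58, 0.25, -0.05]

phi = tucker_congruence(us_openness, region_openness)
meets_criterion = phi >= 0.90  # replicability criterion mentioned in the text
print(round(phi, 2), meets_criterion)
```

The coefficient is insensitive to the overall size of the loadings and reflects only their pattern, so two factors with proportional loadings are treated as identical.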

Overall, we found that the BFI personality structure replicated well across a wide spec- 
trum of cultural regions. The BFI proved to have substantial and robust levels of internal 
reliability, and a five-factor personality structure consistently emerged from principal com- 
ponents analyses. This tended to be true at the level of individual cultures, across all 10 
major world regions and across all nations combined. One major caveat to interpreting the 
importance of these findings is that the 44 BFI items used in the ISDP were predominantly 
imported or etically transported into other cultures. That is, instead of the five factors 
emerging from native conceptions of personhood across all cultures, our findings merely 
confirm that when the 44 English items that form five factors are translated into other lan- 
guages, they retain a five-dimensional structure. Nevertheless, by assessing personality in 
several cultures not previously examined, this study provides more evidence that the five-factor 
structure of personality is generalizable (cf. Peabody & De Raad, 2002). Thus, the 
ISDP results can be taken as new, though limited, evidence that the Big Five dimensions 
of personality can be meaningfully measured across human cultures. 


DO BFI SCORES CONVERGE WITH OTHER MEASURES OF 
THE BIG FIVE ACROSS CULTURES? 


The joint administration of Eysenck’s Neuroticism and Extraversion scales with their 
NEO-PI-R counterparts normally produces a substantial converging correlation between cor- 
responding scales (Costa & McCrae, 1995). Previous studies have also obtained a relatively 
high correspondence between the respective scales of these two instruments, the EPQ and the 
NEO-PI-R, when they have been administered separately to unrelated groups of individuals 
and thereafter averaged into single indicators of whole cultures (McCrae, 2002; though see 
Poortinga et al., 2002). Before the present ISDP findings, there had been no information 
about the convergent validity of self-reported measures of openness, agreeableness, and con- 
scientiousness at the level of intercultural analysis. This investigation filled this knowledge 
gap and we now have evidence that two independent measures of the Big Five (the BFI and 
the NEO-PI-R) demonstrate reasonable cross-cultural agreement, particularly when issues of 
sampling, standardization, and acquiescence are addressed. Cultures that scored high on a 
personality trait as measured by the BFI tended also to score high on that trait as measured 
by the NEO-PI-R. According to both the BFI and NEO-PI-R, for example, Japan’s level of 
neuroticism was among the highest of all cultures, and according to the EPQ, Japan’s neu- 
roticism was the third highest (Lynn & Martin, 1995). 

The significant level of agreement that we uncovered between parallel personality mea- 
sures across cultures is rather remarkable when one considers the many hurdles to obtain- 
ing accurate multiculture and multi-instrument comparisons. There are inevitably 
problems with individual instrument translations, the unrepresentativeness of samples, 
response biases and scale variations, and slightly different definitions of the Big Five 
across the BFI, the NEO-PI-R, and the EPQ. Controlling for at least some of these biases 
and measurement errors (e.g., the acquiescence bias) considerably improved the agreement 
we observed among scales. Future investigations that control for other confounding factors 
or that follow up our results using larger or more representative samples may help clarify 
why a few cultures tended to score differently across measures (e.g., France’s conspicu- 
ously low BFI Extraversion score). 


NATIONAL PROFILES OF PERSONALITY TRAITS: A RETURN 
TO CULTURE AND PERSONALITY? 


Throughout the history of psychology, the idea that people within certain cultures pos- 
sess enduring dispositional differences has fallen in and out of favor over time, with early 
attempts to portray “national character” suffering from serious methodological flaws (for 
a review, see Levine, 2001). Currently, there is a revival of interest in understanding the 
links between culture and personality (Church, 2001; McCrae, 2000). In the case of the 
Big Five traits, it seems probable that certain biases and measurement errors of individual 
assessment devices reduce the accuracy of mean-level portrayals of national personality. 
Based on classical test theory (Nunnally & Bernstein, 1994), this error likely attenuated the 
convergent correlations observed between the BFI and related measures. In that context, 
the general level of agreement we found among personality trait measures may provide a 
reasonable justification for taking the national trait scores of the respective measures and 
finding an overall Big Five profile for individual nations. Moreover, it seems likely that 
these biases and errors would be less damaging to the averaged ranking of nations across 
multiple Big Five instruments. Consequently, we created national rankings on each of the 
Big Five dimensions according to the BFI and the NEO-PI-R and by averaging across the 
BFI and NEO-PI-R (e.g., if Nation A is first on Extraversion according to the BFI and 
Nation A is third on Extraversion according to the NEO-PI-R, this would result in an average 
ordinal ranking of second for Nation A on Extraversion). In so doing, we uncovered 
several distinctive patterns and geographic regularities in personality traits across cultures. 
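
The averaged ordinal ranking described above can be sketched as follows. The code is illustrative and not the authors' procedure: the nation names and scores are invented, the function names are hypothetical, ranks are assumed to be computed only over nations assessed with both instruments, and ties (not handled specially here) are broken alphabetically.

```python
def ordinal_ranks(scores):
    """Rank nations from highest (1) to lowest on a single trait score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {nation: position for position, nation in enumerate(ordered, start=1)}

def average_ranks(scores_one, scores_two):
    """Average each nation's ordinal rank across two instruments."""
    shared = sorted(scores_one.keys() & scores_two.keys())
    ranks_one = ordinal_ranks({n: scores_one[n] for n in shared})
    ranks_two = ordinal_ranks({n: scores_two[n] for n in shared})
    return {n: (ranks_one[n] + ranks_two[n]) / 2 for n in shared}

# Invented Extraversion scores (not ISDP or NEO-PI-R values).
bfi_scores = {"Nation A": 55.0, "Nation B": 50.0, "Nation C": 45.0}
neo_scores = {"Nation A": 47.0, "Nation B": 53.0, "Nation C": 50.0}

combined = average_ranks(bfi_scores, neo_scores)
print(combined)
```

In this invented example Nation A is first on one instrument and third on the other, giving it an average rank of second, which mirrors the example in the text.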

For example, South American and European countries tended to occupy the top ordinal 
positions of the openness dimension, with Chile ranking in first place among all cultures 
of the ISDP (detailed rankings are available from the authors). On the other hand, the bot- 
tom of the openness rankings belonged to mostly East Asian cultures, such as Hong Kong, 
Japan, South Korea, and Taiwan. In other words, according to BFI-based Big Five rankings, 
people from South America and Europe are more open about their surrounding world and 
themselves, and more willing to entertain novel ideas and unconventional values, than are 
people from East Asian cultures. Rankings on other Big Five dimensions 
produced similar kinds of contrasts, such as the finding that people from African cultures 
tended to be low in anxiety and depression (i.e., low in neuroticism). 

Some of these personality rankings are in sharp contrast with the national stereotypes 
that people have about their own country or other nations (Peabody, 1985; Terracciano et 
al., 2005). However, to our knowledge no convincing evidence has demonstrated that 
beliefs about national character are, despite their wide adoption and resistance to change, 
entirely veridical. Rather, they may simply be examples of collectively shared myths and 
empirically there may be no distinctive national character (McCrae, 2001). Often, real dif- 
ferences in means and rankings are too small to be noticed by the naked eye, especially 
compared to the interindividual variation inside each culture, which almost always considerably 
exceeds the former. Even experts in cross-cultural psychology find it difficult to say 
which personality factor is lowest among Hong Kong Chinese and South 
Koreans but highest among Norwegians and Americans. In one study, eight prominent 
cross-cultural psychologists were unable to identify these factors at a better than chance 
level (McCrae, 2001). 

It is possible, of course, that the cross-cultural trait differences, measured by personal- 
ity instruments such as the BFI and the NEO-PI-R, do not reflect people's enduring dis- 
positions to think, feel, and behave in certain ways but are instead culturally endorsed 
styles of responding to personality questionnaires (Johnson, Kulesa, & Cho, 2005; Smith, 
2004; van Herk, Poortinga, & Verhallen, 2004). As a general proposition, however, we 
believe this is unlikely to be the case. Personality trait measures have been shown to pos- 
sess five major dimensions not only based on individual responses but also from group 
data, where each culture was represented as a single subject by their mean scores (McCrae, 
2001). In other words, if cultural mean scores represent little other than a response style or 
bias, it would be unlikely for their cross-cultural correlational manifold and exact structure 
to be equivalent to that derived from individual data. No one has proposed or elaborated a 
theory of response biases with five interpretable, orthogonal factors. We would argue that 
a more realistic explanation is that response styles play a role in self-reported personality 
but are largely confounded with “true” Big Five personality indicators. 

LIMITATIONS AND FUTURE RESEARCH DIRECTIONS 


One of the main challenges for cross-cultural personality psychology in the future will be 
to separate out those factors—including biases, translation inadequacies, and response 
styles—that covary with substantial personality differences but themselves can also affect the 
mean trait scores of nations (Church & Lonner, 1998). Biases and different response styles, 
provided that they exist, are by themselves valuable sources of information about cross- 
cultural differences (Heine et al., 2002; Johnson et al., 2005; Tsai, Knutson, & Fung, 2006). 
For the construction of truly universal and cross-culturally transportable personality instru- 
ments, however, it will be necessary to measure these biases and response styles to take them 
into account and improve the construction of culture-fair measuring instruments. 
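
As one illustration of what measuring a response style might involve, the sketch below computes a simple acquiescence index from pairs of items with opposite meaning and then centers each response on that index. This is a minimal sketch of one common approach, not the procedure used in this study; the item labels, the 1-to-5 response scale, and the function names are assumptions made for the example.

```python
from statistics import mean


def acquiescence_index(responses: dict, pairs: list) -> float:
    """Mean raw response across pairs of opposite-meaning items.

    A respondent with no net tendency to agree or disagree should land
    near the scale midpoint (3 on an assumed 1-to-5 scale)."""
    return mean([(responses[pos] + responses[neg]) / 2.0 for pos, neg in pairs])


def center_responses(responses: dict, pairs: list) -> dict:
    """Subtract the respondent's acquiescence index from every raw response."""
    index = acquiescence_index(responses, pairs)
    return {item: score - index for item, score in responses.items()}


# Hypothetical respondent who tends to agree with everything.
responses = {"talkative": 5, "quiet": 4, "thorough": 5, "careless": 3}
pairs = [("talkative", "quiet"), ("thorough", "careless")]

index = acquiescence_index(responses, pairs)
centered = center_responses(responses, pairs)
print(index)
print(centered)
```

Here the index is well above the midpoint, which flags an agreeing tendency; after centering, the relative ordering of the respondent's answers is preserved while the overall elevation is removed.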

With this perspective in mind, the observed cross-cultural differences in BFI Conscien- 
tiousness may have been detrimentally confounded by different response styles. It is per- 
haps not entirely surprising that Americans presented themselves as highly conscientious, 
as they are known for working longer hours than people in many other cultures (Peabody, 1985). 
However, it seems less obvious why individuals from Ethiopia, Tanzania, and Zimbabwe 
would end up occupying the highest places in the list of the most conscientious nations (as 
compiled by rankings of BFI and NEO-PI-R scale scores). It is equally surprising to see 
Chinese, Korean, and Japanese people in the very bottom of the same list. It seems unlikely 
that most people would think of individuals from these cultures as extremely undisciplined 
and weak willed—a profile indicative of low conscientiousness. 

One possible explanation is that conscientiousness is estimated with respect to cultural 
norms. That is, certain norms may establish how punctual, strong-willed, and reliable 
people are expected to be in different cultures. Suppose, for example, that there are differ- 
ent cultural standards for being organized, purposeful, and achievement oriented. Let us 
imagine a culture where these standards are set extremely high and almost every effort falls 
short of these almost compulsive demands. Compared with these standards, almost every- 
one is forced to report on a self-report scale that he or she is less organized and determined 
than is generally the case in this particular culture. Perhaps Japan and Korea are prototyp- 
ical examples of this type of cultural response bias. Japanese is, for example, a unique lan- 
guage having a special word referring to death from overwork (karoshi), and in Korea 
unexpected natural death has become the leading compensated work-related cause of death 
(Park et al., 1999). 

In contrast, in many other cultures, prudence, dutifulness, and achievement striving are 
not emphasized as cultural norms. Nobody expects themselves or others to be extremely 
punctual and self-disciplined and to plan their action with caution and consideration. 
Instead, nearly every achievement might surpass relatively modest expectations and could 
be regarded in these cultures as an act of strong will and determination. If this explanation 
is valid, we might speculate that in Ethiopia, Tanzania, and Zimbabwe—three of the top 
conscientiousness countries in the ISDP set of nations—the cultures have developed a 
rather different suite of norms concerning conscientiousness than have been developed in 
Japan, Hong Kong, or Korea. 
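
The logic of this norm-relative account can be sketched with a toy calculation. The model below is an assumption made purely for illustration: it treats a self-rating as the scale midpoint plus the gap between a person's actual level and the cultural standard, and all numbers and culture labels are hypothetical rather than estimates for any real nation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Culture:
    name: str
    true_level: float  # hypothetical actual level of conscientious behavior
    norm: float        # hypothetical cultural standard used for comparison


def self_report(c: Culture, midpoint: float = 3.0,
                lo: float = 1.0, hi: float = 5.0) -> float:
    """Self-rating on an assumed 1-to-5 scale, judged relative to the norm."""
    raw = midpoint + (c.true_level - c.norm)
    return max(lo, min(hi, raw))  # keep the rating within the scale range


cultures = [
    Culture("High-standard culture", true_level=8.0, norm=9.0),
    Culture("Modest-standard culture", true_level=6.0, norm=5.0),
]

by_true = sorted(cultures, key=lambda c: c.true_level, reverse=True)
by_report = sorted(cultures, key=self_report, reverse=True)

for c in cultures:
    print(c.name, "true:", c.true_level, "self-report:", self_report(c))
```

In this toy case the culture with the higher actual level falls short of its own demanding standard and therefore reports the lower score, so a ranking based on self-reports reverses the ranking based on actual levels.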

This type of norm-related explanation is not uncommon (Heine et al., 2002) and has 
been used to explain another paradox concerning suicide and well-being. Suicide rates, it 
turns out, tend to be higher in those nations that rank high on subjective well-being 
(Inglehart, 1990). To resolve the inconsistency that happy people are more prone to sui- 
cide, it was proposed that suicide rates do not reflect the overall happiness but instead are 
affected by cultural norms. Those cultures in which suicide is most widespread tend to 
have the strongest norms of describing oneself as happy. Conceivably, being deeply unhappy 
in a society where everybody is expected to be happy is even more unbearable than it would 
be in a society where misery is not so far from the norm (Inglehart, 1990, p. 245). Without an 
independent measure of cultural norms, this explanation must remain relegated to the status 
of plausible speculation. Still, this analysis serves as an example of further studies that could 
be stimulated by the results reported in this study. 

One final limitation that should be mentioned involves the representativeness of our 
ISDP samples. For some nations, ISDP samples included both college students and com- 
munity members. However, most nations were represented only by college student 
samples. This form of sampling can reduce the number of confounds across nations by 
restricting all participants to college-age individuals who have completed the equivalent of 
a high school education. On the other hand, college student samples are unrepresentative 
of national populations, and the degree of this unrepresentativeness can vary across cul- 
tures. Indeed, African students from Botswana, the Democratic Republic of the Congo, 
Ethiopia, Morocco, South Africa, Tanzania, and Zimbabwe may constitute subportions of 
their cultures that are especially elite compared to college students in the United States and 
Western Europe. The same also may be true of some of our South American, Eastern 
European, Middle Eastern, and South and Southeast Asian samples. Future research using 
truly representative samples from a wider range of cultures will help to document national 
trends and variations in personality dispositions more accurately. 

The limited focus of the current study leaves many potentially important questions 
unanswered. Future researchers may profit from relating the current trait profiles to other 
national variables, including additional socioeconomic indicators, beliefs, and values. 
Additional analyses looking at the natural grouping or clustering of specific Big Five traits 
and the reporting of trait-specific similarity indexes across nations are also of potential 
interest for the future. 


CONCLUDING REMARKS 


This study had three primary objectives. First, we examined whether the factor struc- 
ture of the English BFI fully generalized across diverse forms of human culture. As part of 
the ISDP, the BFI was translated into 28 languages and administered to samples from 56 
nations. We found that the five-dimensional structure of the BFI was highly replicable 
across all the major cultural regions of the world. Results also indicated that the factor 
scales possessed high levels of internal reliability across all cultures. 

The second objective was to evaluate the validity of nation-level BFI trait profiles. We 
found that BFI trait levels were reliably related to national profiles previously reported in 
the literature (e.g., from the NEO-PI-R), particularly when issues of sampling and acqui- 
escence are addressed. Importantly, these findings provided the first cross-cultural and 
cross-instrument validity evidence for the personality dimensions of Agreeableness, 
Conscientiousness, and Openness. We also found that nation-level personality profiles pro- 
vided by different Big Five measures converged in their relationships with key external cri- 
teria, such as sociosexuality and self-esteem. 

A third objective was to document the worldwide distribution of personality traits as mea- 
sured by the BFI. We found several patterns across cultures, including that people from the 
geographic regions of Africa and East Asia were significantly different in conscientiousness 
from those inhabiting other world regions, with the former being more conscientious and the 
latter reporting less conscientiousness than people from other world regions. In sum, our ISDP 
findings, though limited in many ways, can be taken as an incremental addition to the grow- 
ing body of evidence that the Big Five dimensions of personality can be reliably measured 
across diverse human cultures. The BFI, in particular, may be especially useful for future 
researchers looking for a brief measure of basic personality traits. The BFI profiles generated 
by the ISDP may also prove useful as a baseline against which future large-scale studies of 
personality can be compared. 